Introduction*

P.G. Drazin^a and G.P. King^b

^a Department of Mathematics, University of Bristol, Bristol, UK
^b Department of Mathematics, University of Warwick, Coventry, UK


This is a report of the proceedings of the IUTAM Symposium and NATO Advanced Research Workshop on "The
Interpretation of Time Series from Nonlinear Mechanical Systems", held at the University of Warwick, England, from
26 to 30 August 1991. It contains a brief and partial review of the state of the art of the subject, and an account of a few
highlights of the Symposium. This paper also introduces the papers which follow in this volume, papers which report some
of the contributions to the Symposium.


1.  Introduction 

Whether  a  time  series  is  produced  by  a  linear, 
chaotic  or  stochastic  process,  the  aims  of  its 
interpretation  are  the  same: 

(1)  to  detect  useful  and  interesting  patterns  by 
exploring  the  data; 

(2) to construct a model by using the data and
as much other knowledge of the process as possible; and

(3) to verify that the model can both reproduce
and predict the patterns, and, if necessary,
to improve the model further.

Statisticians  and  communications  engineers 
have  analysed  time  series  for  several  decades, 
using  primarily  linear  methods  of  analysis,  such 
as Fourier transforms and filters. The explosive
growth of research in nonlinear systems in the
1970s and 1980s led to new goals of, and new
approaches to, the interpretation of time series.
The need to use observations of the output of a
nonlinear system, whether the system is in nature,
a laboratory or a computer, in order to
ascertain the character of the system, and the
need to predict the future behaviour of the system,
drove both theoreticians and experimentalists
to devise new methods to interpret time
series (and sometimes to re-discover old methods).
It drove them to seek to reconstruct the
phase space rather than use the tools of stochastic
and linear analysis.

* This work relates to Department of the Navy Grant
N00014-91-J-9049 issued by the Office of Naval Research
European Office. The United States has a royalty-free license
throughout the world in all copyrightable material contained
herein.

The  discovery  of  deterministic  chaos,  with  the 
appearance  of  randomness,  directed  research 
along  lines  somewhat  similar  to  those  followed 
by  statisticians  earlier,  although  different  kinds 
of information from the data were sought. If the
system has homed in on an attractor, it is valuable
to reconstruct, from the time series alone,
the fractal dimension and other topological invariants
of the attractor, the attractor in its phase
space, or even the governing differential equations
themselves. Early claims of successful analysis
evoked the dream of putting a thermometer
outside one’s window, measuring the temperature
at regular intervals, and finding the dynamics
of the whole atmosphere, so that the
future climate would be predictable. This violates
physical intuition, and violates it for good
reasons.

Chaos  is  fascinating,  beautiful  and  intriguing, 
but  also  dangerous.  Its  temptations  are  so  great 
that many scientists think of it and see it wherever
they look. The danger was anticipated by
Eddington,  who  wrote  (in  an  astrophysical 
context) 

“We have found that where science has progressed
the farthest, the mind has but regained
from nature that which the mind put into
nature. We have found a strange foot-print in
the shores of the unknown. We have devised
profound theories, one after the other, to
account for its origin. At last, we have succeeded
in reconstructing the creature that
made the foot-print. And lo! It is our own”.

This  is  uncomfortably  close  to  a  description  of 
some  modern  work,  in  particular  the  calculation 
of fractal dimension. Many difficulties of interpreting
time series from a chaotic system became
apparent in the 1980s. The difficulty of keeping
even a carefully controlled laboratory system
stationary, the difficulty of measuring data over
long enough times, the difficulty of making measurements
frequently enough, the difficulty of
making  measurements  accurately  enough,  and 
the  difficulty  of  distinguishing  chaos  from  noise, 
were  recognized  as  imposing  severe  practical 
limitations  on  methods  of  interpreting  time 
series. 

These  developments,  some  fast  moving,  made 
it  timely  to  hold  an  international  meeting  on  the 
topic. 

2.  The  Symposium 

The Symposium set out to examine critically
modern methods and to ensure that reconstruction
of chaotic states will reveal nature’s footprints,
not our own. We sought how to learn
about any physical system of interest from measurements
of the system which are contaminated
by extraneous physical effects. We concluded
that we are nearing the stage when several algorithms
will be widely available for use by those
with no specialist knowledge of the theory of
dynamical systems, and that these algorithms will
not confuse chaos with correlated noise. In discussion
some said that some makers of algorithms
had been unjustly blamed for poor
results by those who had misused the algorithms
because they misunderstood the theory. This
emphasizes that the algorithms will not be magic
solutions of all nonlinear problems, but will require
careful application and merely be added to
the repertoire of scientific methods, which require
thoughtful modelling and verifying. In
short, common sense is essential.

The  method  of  delays  is  fundamental  to  much 
of  the  work  reported  at  the  Symposium.  This 
method  was  proposed  independently  by  Packard, 
Crutchfield,  Farmer  and  Shaw  [1],  Takens  [2], 
and Ruelle in a private communication (cf. ref.
[3]), whereby time series of a few state variables
are used to generate a large-dimensional vector.
For example, with a single time series $y(t)$ sampled
$N + 1$ times at intervals $\tau$, i.e. at
$t = 0, \tau, 2\tau, \ldots, N\tau$, the $l$-vector
$y_n = [y(n\tau), y((n+1)\tau), \ldots, y((n+l-1)\tau)]^T$
can easily be constructed. Takens went on to use the Whitney
embedding theorem to show that the method of
delays can be used to reconstruct finite-dimensional
chaotic attractors. Tong told, or reminded,
the Symposium that these ideas are
rooted in the scatter diagram used by Yule [4] in
1927 and its subsequent developments by statisticians.
The essential novelty in these methods lies
in reconstructing the dynamics, or at least topological
invariants of the dynamics, of the system
which has produced the data.
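
As an illustration only, the construction of delay vectors described above might be sketched in Python as follows. The function name `delay_vectors` and the sample values are hypothetical; the sketch assumes that the list entry `y[n]` holds the sample taken at time n times the sampling interval.

```python
def delay_vectors(y, l):
    """Return the l-dimensional delay vectors built from the samples y.

    y[n] is assumed to be the n-th sample of the series; the n-th delay
    vector is [y[n], y[n+1], ..., y[n+l-1]].
    """
    if l < 1 or l > len(y):
        raise ValueError("l must satisfy 1 <= l <= len(y)")
    return [list(y[n:n + l]) for n in range(len(y) - l + 1)]


# Hypothetical data: six samples, delay vectors of dimension 3.
samples = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
vectors = delay_vectors(samples, 3)
```

A series of six samples yields four delay vectors of dimension three, since the last vector must end on the last sample.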

So, although the idea of constructing a delay
vector was not new, Takens’ [2, p. 371] embedding
theorem was. It is so fundamental to many
of the papers in this volume that we state it here,
albeit informally. It is assumed that a smooth
dynamical system has an attractor in a compact
manifold $M$ of dimension $m$, and that the observable
$y: M \to \mathbb{R}$ is determined smoothly from any
given state $x$ of the system and measured at time
$t = n\tau$. Thus $y = y(x(n\tau))$ as $x: \mathbb{R} \to M$ evolves,
i.e. as the system changes with time. Then
Takens’ theorem states that it is a generic property
that $\Phi_y: M \to \mathbb{R}^{2m+1}$, defined by

$\Phi_y(x) = [y(x(t)), y(x(t + \tau)), \ldots, y(x(t + 2m\tau))]^T$

is an embedding. (Mañé [5, p. 232] proved a
similar embedding theorem in which $M$ is a
compact subset of a Banach space and the embedding
dimension $2m + 1$ is the least integer
greater than or equal to one plus twice the
Hausdorff dimension of $M$.) Of course, measurements
of the series $y(x(n\tau))$ do not tell us directly
what the unknown dynamical system, attractor,
manifold $M$ and dimension $m$ are, but Takens’
theorem indicates that the dimension of the
delay vector may have to be as large as $2m + 1$
before we can be assured that it is sufficient for
the reconstruction of the topological behaviour
of the attractor. For more recent work on embedding
see, e.g., Sauer, Yorke and Casdagli [6].

This reconstruction is an ideal unrealizable in
practice, because measurements are of finite duration
and noisy. One method to reduce noise is
to choose an appropriate basis of vectors to
represent the time series, the vectors being determined
by the data themselves. In fact the
basis is chosen to be the set of eigenvectors of
the two-point correlation matrix computed from
the series. The method was given different
names by different lecturers. It was originated by
statisticians, who call it principal component
analysis. Others call it the method of singular-value
decomposition, singular system analysis,
singular spectrum analysis, bi-orthogonal decomposition,
or proper orthogonal decomposition; or
call the vectors Karhunen-Loeve vectors or
empirical orthogonal functions. The method
emerged  as  a  tool  in  many  lectures.  We  suggest 
that  the  name  singular-value  decomposition  is 
better  reserved  as  the  title  of  the  method  of 
factorizing  a  matrix,  a  method  which  dates  back 
to  Beltrami’s  work  in  the  nineteenth  century  (cf. 
ref.  [7]),  because  it  is  only  one  part  of  the 
method  of  noise  reduction. 
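
A minimal sketch of the first step of such an analysis is given below, under stated assumptions. The correlation matrix is taken to be the covariance matrix of the centred delay vectors, and only its leading eigenvector is found, by power iteration, which is swapped in here for a full eigen-decomposition or singular-value decomposition. All function names are hypothetical.

```python
import math


def delay_vectors(y, l):
    """Delay vectors [y[n], ..., y[n+l-1]] of the sampled series y."""
    return [list(y[n:n + l]) for n in range(len(y) - l + 1)]


def covariance_matrix(vectors):
    """Covariance matrix of the delay vectors after removing their mean."""
    count, dim = len(vectors), len(vectors[0])
    means = [sum(v[i] for v in vectors) / count for i in range(dim)]
    centred = [[v[i] - means[i] for i in range(dim)] for v in vectors]
    return [[sum(c[i] * c[j] for c in centred) / count for j in range(dim)]
            for i in range(dim)]


def leading_eigenpair(matrix, iterations=200):
    """Power iteration for the largest eigenvalue and its unit eigenvector.

    The start vector (1, 0, ..., 0) is an arbitrary choice; it fails only
    if it happens to be orthogonal to the leading eigenvector.
    """
    dim = len(matrix)
    vec = [1.0] + [0.0] * (dim - 1)
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(dim))
               for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            break
        vec = [x / norm for x in new]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(dim))
                for i in range(dim))
    return value, vec


# Hypothetical data: a linear ramp, whose delay vectors lie on one line.
vectors = delay_vectors([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2)
value, direction = leading_eigenpair(covariance_matrix(vectors))
```

For the ramp all the variance lies along the diagonal direction, so the leading eigenvector has equal components. Projecting noisy delay vectors onto the few leading eigenvectors is the noise-reduction step proper, which this sketch does not attempt.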


Another  tool  of  increasing  use  is  the  wavelet 
transform.  It  is  a  generalization  of  the  Fourier 
transform  which  represents  the  translation  as 
well  as  the  scaling  of  components  of  a  signal. 
Arneodo’s  lecture  was  a  masterly  review  of 
wavelets,  and  Grossman’s  an  example  of  their 
modern  use. 

A  prevalent  theme  of  the  Symposium  was  the 
hard  task  of  distinguishing  between  the  signal 
and  noise,  between  chaos  and  extraneous  effects 
or  errors.  We  could  not  agree  on  a  definition  of 
noise,  but  we  did  agree  that  it  is  important.  The 
problem  of  determining  what  portion  of  a  time 
series  is  due  to  deterministic  chaos  and  what  to 
noise  was  the  subject  of  several  lectures  and 
many  animated  discussions.  New  approaches  to 
this  problem  are  described  in  the  papers  in  the 
section  Chaos  or  noise?  of  this  volume. 

Concepts of phase space and time delay reconstruction
have led to new approaches to modelling
time series from nonlinear systems. Applications
to prediction, control, noise reduction, filtering
and signal extraction were described in
several lectures.

Another recurring theme in the Symposium
was the parallel developments of the interpretation
of time series by theoreticians of statistics,
signal processing and dynamical systems.
Theoreticians have belatedly seen how to apply
the tools of statistics and signal processing to the
interpretation of time series from real dynamical
systems. Conversely, the concepts of deterministic
chaos, phase space, fractal dimension etc. are
beginning to lead to new developments of statistics
and signal processing.

The  Symposium  was  dominated  by  specialists 
in  dynamical  systems,  but  the  statisticians  were 
represented  eloquently  and  cogently  by  Tong. 
He  reminded  us  of  the  priority  of  statisticians  in 
making many discoveries, and illuminated several
lectures with a statistical point of view. The
interest of the statistical community in chaos in
general, and in the subject of the Symposium in
particular, is aptly marked by Chaos Day, a
meeting which was organized by the Royal
Statistical Society on 16 October 1991, so close
to the Symposium. The reader may wish to see
the proceedings [8] of that meeting.

Until recently, low-dimensional systems have
been emphasized in nonlinear dynamics. Now
interpretation of time series from spatially extended
systems is being developed rapidly. Excellent
examples are found in spatiotemporal
chaos, and in fluid mechanics, where we see a
low-dimensional system, then spatiotemporal
chaos and finally turbulence as a fluid system is
driven more strongly. Several papers in this volume
illuminate this; also there were some valuable
lectures on identification of coherent structures
in turbulence (by Otaguro and Sato, by van
de Water, and by Vassilicos). But, of course,
applications of the interpretation of time series
to many other fields, e.g. the instrumentation of
analysis of signals in real time (by Namajunas
and Tamasevicius), are equally valuable.

It  was  encouraging  also  to  hear  lectures  and  to 
see  posters  on  interpreting  time  series  of  natural 
phenomena  in  the  earth  and  life  sciences.  They 
offer  new  insights  into  some  very  important  and 
useful,  but  formidable,  problems. 

Our original intention was to build a data bank
of time series to make it available world wide for
analysis, and to hold a competition for participants
to apply their algorithms on the same sets of
data. However, our plans were overtaken by
events. For the Santa Fe Institute has set up
similar arrangements and fortunately has the resources
to sustain the programme at a higher
level and for a longer time than we could. It will
run a workshop in May 1992, and retain a repository
of interesting data and analysis programs
thereafter. For further information, the interested
reader may write to Andreas Weigend or
Neil Gershenfeld at the Santa Fe Institute, 1660
Old Pecos Trail, Santa Fe, NM 87501, USA.

Nonetheless, the participants at the Symposium
not only took an evening to watch videos
of computer and laboratory experiments, but
also worked on computers to demonstrate algorithms
and exchange software.

There were 66 participants at the Symposium,
joined by a few observers and several young
scientists. The names of the participants, the
observers, the young scientists, and their lectures
and posters are listed at the end of this volume.

This is our description of interpretation of
time series from nonlinear systems and of the
proceedings of the symposium, but the best description
of the state of the art of interpretation
of time series is the lectures, posters and discussion
at the symposium itself and the papers
which follow in this volume. The papers are
grouped in sections by subject. The subjects
merge with one another, so our intention in the
grouping is to help the reader rather than belittle
the papers.
Backpropagating neural networks are used to reconstruct the attractors of two low-dimensional chaotic systems using
small input sets of noise-corrupted data. The nets are able to reconstruct attractors that are visually similar to, and have the
same correlation dimensions as, attractors constructed from noise-free data.


1. Introduction

In recent years, a vocabulary for the quantitative
characterization of chaos has been developed,
and has been used to describe and
analyze an incredible variety of phenomena in
practically all fields of science and engineering
(see e.g. ref. [1]). Procedures for the calculation
of the quantities that make up this vocabulary -
dimensions, entropies, Lyapunov exponents,
topological indices, etc. - typically presume the
availability of copious amounts of data measured
with high precision, and with minimal noise contamination.
Unfortunately, this is rarely the case
with data from real experiments which usually
provide only noise-corrupted data sets of limited
size and limited precision. Thus, when confronted
with a data set that is known to be
noise-contaminated, one often cannot just ask,
“is it chaos or is it noise?”. Rather, one is forced
to ask, “is it just noise, or is there some potentially
meaningful dynamical information being
masked by the noise?”. Before any of the powerful
nonlinear dynamical techniques of analysis
can be applied, the effects of noise need to be
eliminated or at least minimized.

The most obvious procedure for minimizing
the effects of noise is by filtering. There is increasing
evidence, however, that filtering
changes the values of some nonlinear dynamical
indices, notably the correlation dimension. Some
types of filters increase the system’s calculated
dimension [2], others decrease it [3]. There are
some chaotic systems - the Henon map is an
example - which have remarkably flat spectra so
that any filtering is sure to have serious effects
on the dynamical information contained by a
time series generated by the system. Conventional
filters should be used with the greatest of care,
if at all.

There  are  other  noise-reduction  techniques 
that  do  not  rely  on  conventional  digital  filtering 
[4].  Some  of  these  make  use  of  singular  value 
decomposition  or  principal  component  analysis 
and have been shown to be effective for relatively
low noise levels.

Neural networks present another possibility.
Recently, there has been considerable progress
in the use of neural networks to recognize spatial
as well as temporal patterns [5-7]. The use of
neural networks in this context has capitalized on
their ability to generalize as well as on their
remarkable fault tolerance. The first of these
properties means that they can successfully classify
patterns that have not been previously presented;
the second means that they can recognize
corrupted patterns [5]. The first characteristic
makes them candidates for the analysis of nonlinear
signals as has, in fact, been done by
Lapedes and Farber [6]. The second suggests
that they may remain effective even in cases
when the signal is corrupted by noise.

In this contribution we present evidence that
neural networks may, indeed, be able to extract
geometric information about the attractors of
some chaotic systems from noise-corrupted data.
In the following, we specify the relative amount
of noise by the signal-to-noise ratio (SNR) in
decibels, defined by SNR = 10 log_10(signal
variance/noise variance). We will be considering
situations for which SNR = 10 dB, i.e. when the
variance of the noise is one tenth that of the
signal.
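
The definition of the SNR can be made concrete with a short sketch. Under this definition a given SNR fixes the ratio of the variances: signal variance divided by noise variance equals 10 raised to the power SNR/10. The helper names, the seed and the sine-wave test signal below are assumptions made for illustration and are not taken from the work reported here.

```python
import math
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def snr_db(signal, noise):
    """SNR = 10 log10(signal variance / noise variance), in decibels."""
    return 10.0 * math.log10(variance(signal) / variance(noise))


def add_gaussian_noise(signal, target_snr_db, seed=0):
    """Add Gaussian noise rescaled so the realised SNR equals the target."""
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in signal]
    wanted_variance = variance(signal) / 10.0 ** (target_snr_db / 10.0)
    scale = math.sqrt(wanted_variance / variance(raw))
    noise = [scale * r for r in raw]
    return [s + n for s, n in zip(signal, noise)], noise


# Hypothetical clean signal: a sampled sine wave.
clean = [math.sin(0.3 * k) for k in range(200)]
noisy, noise = add_gaussian_noise(clean, 10.0)
```

Rescaling the noise sample, rather than drawing it with a nominal standard deviation, makes the realised SNR of a short series equal to the target instead of merely close to it.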

In section 2 we present some details of the
backpropagation algorithm and describe the simple
networks we use. In section 3 we present
results on the use of these networks to analyze
chaotic time series calculated from the logistic
equation and from the Henon equations. The
analysis involves both “clean” data (i.e. values
calculated from the equations of motion) and
“noisy” data obtained by adding Gaussian random
noise to the clean data to get a signal-to-noise
ratio (SNR) of 10 dB. The performance of
the  nets  is  judged  by  their  ability  to  reconstruct 
attractors  that  are  visually  similar,  and  have 
essentially  the  same  correlation  dimensions,  as 
the  systems  being  studied.  The  results  show  that 
the  networks  are  capable  of  satisfying  these 
criteria  even  when  using  remarkably  small, 
noise-corrupted  data  sets  as  inputs. 

Section 4 summarizes the results and discusses
some problems. One set of problems arises from
the very same fault-tolerance which is one of the
greatest strengths of neural networks. An aspect
of this property is that once trained, a network
can still function satisfactorily even if some
nodes are damaged [5]. This means that there is
no unique relationship between the architecture
of the network, or its internal state, and the task
which it has learned to perform, or the time
series it has learned to predict. This points to the
need to develop techniques that would make
neural networks perform more robustly (see, in
this regard, the paper by de Groot and Würtz
elsewhere in this volume [12]), and to define
criteria that could be used for optimizing their
performance.


2.  Backpropagating  neural  networks 

Backpropagating neural networks have been
used to predict chaotic time series using presumably
noise-free data as input [6]. Figure 1 shows
a neural net of the type we use in this work: $M$
input units, $H$ units in a single hidden layer, and
one output unit. Inputs and outputs of all units
are real numbers. A time series, $\{x(k), k = 1, \ldots, N\}$,
is used as a training set. Using the
architecture shown in fig. 1, $M$ successive values
of the time series, $x(k - M), \ldots, x(k - 2), x(k - 1)$,
are used to predict the next value,
$x(k)$. These inputs are weighted with weights
$w^{(h)}_{ij}$, a bias, $b^{(h)}_j$, is added and the result is fed to
a hidden unit. The input to the $j$th hidden unit
for this set of inputs is thus

$I(k)^{(h)}_j = b^{(h)}_j + \sum_{i=1}^{M} w^{(h)}_{ij}\, x(k - i)$.   (2.1)

Fig. 1. Backpropagating neural network with $M$ inputs, one
hidden layer, and one output.

The superscript (h) means that the quantities
pertain to the hidden units. The activation function
of the hidden units is a sigmoid - that is,
with eq. (2.1) as input, the output of the $j$th
hidden unit is

$O(k)^{(h)}_j = 0.5\,(1.0 + \tanh I(k)^{(h)}_j)$.   (2.2)

This is sometimes referred to as a “threshold”
function or an “activation” function. One may
use other functions. One form that is widely used
is $O(x) = 1/(1 + \exp(-kx))$, which is also called
sigmoid by some [5], or the “logistic function”
by others [7]. It may also be symmetrized so that
its range is $(-1, 1)$ rather than $(0, 1)$ as in the
above. Some of the details of the threshold
function may not be crucial, but it is essential for
the function to be monotonic, bounded from
above as well as from below, and be everywhere
differentiable [5,7].
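
The two forms of sigmoid mentioned above are closely related: 0.5(1 + tanh x) is identical to 1/(1 + exp(-2x)), i.e. the logistic form with the constant in the exponent set to 2. The short sketch below checks this numerically; the function names are hypothetical.

```python
import math


def tanh_sigmoid(x):
    """Activation of the form 0.5 * (1 + tanh x); its range is (0, 1)."""
    return 0.5 * (1.0 + math.tanh(x))


def logistic(x, k=1.0):
    """Logistic function 1 / (1 + exp(-k x)); its range is (0, 1)."""
    return 1.0 / (1.0 + math.exp(-k * x))


def symmetric_sigmoid(x):
    """Symmetrized variant with range (-1, 1)."""
    return math.tanh(x)


# The tanh form coincides with the logistic form for k = 2.
largest_gap = max(abs(tanh_sigmoid(x) - logistic(x, 2.0))
                  for x in [-3.0, -0.5, 0.0, 1.2, 4.0])
```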

The output of the net is a biased and weighted
sum of the hidden unit outputs:

$\hat{x}(k) = b^{(o)} + \sum_{j=1}^{H} w^{(o)}_j\, O(k)^{(h)}_j$,   (2.3)

the superscript (o) on the weights and biases
signifying that these refer to the output unit.
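
The forward pass defined by eqs. (2.1)-(2.3) can be sketched in a few lines. This is an illustrative reading of the equations, not the authors' program; the function names and the layout of the weights (one list of input weights per hidden unit) are assumptions.

```python
import math


def hidden_outputs(inputs, hidden_weights, hidden_biases):
    """Eqs. (2.1)-(2.2): biased weighted sums passed through 0.5*(1 + tanh)."""
    outputs = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        total = bias + sum(w * x for w, x in zip(weights, inputs))
        outputs.append(0.5 * (1.0 + math.tanh(total)))
    return outputs


def net_output(inputs, hidden_weights, hidden_biases,
               output_weights, output_bias):
    """Eq. (2.3): biased, weighted sum of the hidden-unit outputs."""
    hidden = hidden_outputs(inputs, hidden_weights, hidden_biases)
    return output_bias + sum(w * o for w, o in zip(output_weights, hidden))


# Hypothetical net with two inputs and two hidden units. With all hidden
# weights and biases zero, every hidden unit outputs exactly 0.5.
prediction = net_output([0.3, 0.7],
                        hidden_weights=[[0.0, 0.0], [0.0, 0.0]],
                        hidden_biases=[0.0, 0.0],
                        output_weights=[1.0, 2.0],
                        output_bias=0.25)
```

Note that the output unit is linear, so the net's predictions are not confined to the range (0, 1) of the hidden units.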

The net is trained as follows: (1) start with a
random set of weights and biases, (2) calculate
the outputs for all possible input sets, (3)
evaluate the “error”

$Q = \sum_{k=1}^{N'} [x(k) - \hat{x}(k)]^2$,   (2.4)

where $N'$ is the total number of input sets,
N'  =  N  -  M  -  The  crucial  step  is:  (4)  adjust 
the  weights  and  biases  to  minimize  the  error. 


The name “backpropagating rule” was given to
this procedure by Rumelhart et al. [4,6], because
the behavior of the error function is propagated
backward to adjust the weights and biases
that produced it.
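
Steps (2) and (3) of the training procedure can be sketched as follows, assuming that the error is the sum of squared differences between each actual value and the value predicted from the M values preceding it. The names `training_pairs` and `total_error`, and the stand-in predictors, are hypothetical; any function mapping M inputs to one output may take the place of the net.

```python
def training_pairs(series, m):
    """Pair each run of m successive values with the value that follows it.

    The inputs are ordered oldest first: x(k-m), ..., x(k-1).
    """
    return [(series[k - m:k], series[k]) for k in range(m, len(series))]


def total_error(series, m, predict):
    """Sum of squared differences between actual and predicted values."""
    return sum((target - predict(inputs)) ** 2
               for inputs, target in training_pairs(series, m))


# Hypothetical data and two stand-in predictors.
ramp = [1.0, 2.0, 3.0, 4.0, 5.0]
persistence_error = total_error(ramp, 2, lambda inputs: inputs[-1])
linear_error = total_error(ramp, 2, lambda inputs: 2.0 * inputs[-1] - inputs[-2])
```

On the ramp, repeating the last value is wrong by one at every step, whereas linear extrapolation is exact; training amounts to searching the weights and biases for the predictor with the smallest such error.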

The minimization of the error is performed by
means of the usual method of steepest descent,
or a variant of it called the “generalized delta
rule” in the neural net literature [5-7]. The error
$Q$ is taken as a function of the weights and biases
which, for convenience, we collectively denote
by $\{a_i\}$, the index $i$ taking on as many values as
there are weights and biases. When the parameter
$a_k$, say, changes by $\delta a_k$, then $Q$ changes by

$\delta Q = (\partial Q/\partial a_k)\, \delta a_k$,   (2.5)

where $\partial Q/\partial a_k$ is evaluated using the current
values of the parameters. To ensure that changes
in $a_k$ result in a decrease in $Q$, one chooses

$\delta a_k = -\gamma\, (\partial Q/\partial a_k)$,   (2.6)

where $\gamma$ is a positive constant called the “learning
rate”. $\gamma$ is to be large enough so that the
training can be accomplished in a reasonable
amount of time, but small enough so that one
does not “overshoot” the sought-after minimum.
As an attempt to avoid overshooting when using
relatively large values of $\gamma$, a term $\mu\, \delta a_k'$ is
sometimes added to eq. (2.6), where $\delta a_k'$ is the
previous change in $a_k$ and $\mu$ is another positive
constant called the “momentum”. As the name
suggests, inclusion of this term tends to keep
changes in the parameter going in the same
direction, regardless of the direction determined
by the gradient.
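
The update rule with a momentum term can be sketched on its own, independently of any network. In the sketch below a toy quadratic error with a known minimum stands in for the network error, and the gradient is supplied analytically; the function names and the values of the learning rate and momentum are assumptions chosen for illustration.

```python
def descend(gradient, params, learning_rate, momentum, steps):
    """Repeat: change = -learning_rate * dQ/da + momentum * previous change."""
    params = list(params)
    previous = [0.0] * len(params)
    for _ in range(steps):
        grad = gradient(params)
        change = [-learning_rate * g + momentum * p
                  for g, p in zip(grad, previous)]
        params = [a + d for a, d in zip(params, change)]
        previous = change
    return params


def toy_gradient(a):
    """Gradient of the toy error Q(a) = (a0 - 1)^2 + 10 (a1 + 2)^2."""
    return [2.0 * (a[0] - 1.0), 20.0 * (a[1] + 2.0)]


# The toy error has its minimum at (1, -2).
fitted = descend(toy_gradient, [0.0, 0.0],
                 learning_rate=0.05, momentum=0.5, steps=500)
```

Setting `momentum=0.0` recovers plain steepest descent. For a real net the gradient would be obtained by propagating the error backward through eqs. (2.3)-(2.1), which this sketch does not do.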

3.  Results 

When used for time series prediction and modeling
as in ref. [6], the performance of the net is
judged by its ability to predict future values of
the time series given an appropriate number of
its values in the past. Although we use this
criterion  in  training  the  net,  we  do  not  use  it  in 
the  evaluation  of  its  performance.  In  the  training 
phase,  we  expect  that  a  minimum  of  the  error 
function  reflects  values  of  the  weights  and  biases 
that  average  out  the  effects  of  noise,  and  that 
when  a  trained  net  is  used  for  prediction,  what  is 
predicted  are  not  future  values  of  the  noisy  data 
set,  but  of  the  signal  that  was  obscured  by  the 
noise  in  the  original  data. 

The  examples  we  study  here  are  data  from 
chaotic  systems.  This  introduces  an  additional 
complication.  Since  chaos  is  characterized  by 
sensitive  dependence  on  initial  conditions,  we  do 
not expect to achieve reliable long-term predictions,
in part because of this sensitivity, and in
part because we are dealing with noisy data
which means that successive points in the time
series are not necessarily successive points in the
same trajectory. We therefore cannot expect to
get a reliable prediction of the system’s
dynamics - that is, of the precise temporal evolution
of its dynamical variables. Rather, as we
show below, we get some geometrical information
on the system’s attractor - namely, the form
of the attractor and its correlation dimension.

Data sets calculated from the logistic equation,

$x_{n+1} = 4x_n(1 - x_n)$,   (3.1)

and values of the $x$ variable calculated from the
Henon equations,

$(x_{n+1}, y_{n+1}) = (y_n + 1 - 1.4x_n^2,\ 0.3x_n)$,   (3.2)

are used as inputs. Values calculated from the
above equations as well as those to which random
Gaussian noise has been added are used in the
subsequent analysis. The parameters in both of
the  above  systems  are  those  known  to  yield 
chaotic  behavior.  Logistic  data  were  analyzed 
using  a  net  with  one  input,  four  hidden  units, 
and  one  output  while  Henon  data  were  analyzed 
with  a  net  of  two  inputs,  four  hidden  units,  and 
one output. Learning rates and momenta were
between 0.0 and 1.0. The network architecture
(i.e. the number of input and hidden units) as
well as values of the learning rate and momentum
were chosen from among a few combinations
to give the fastest convergence of $Q$ to a
minimum.
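
The “clean” input data can be generated directly from the two maps. The sketch below iterates the logistic map x -> 4x(1 - x) and the Henon map (x, y) -> (y + 1 - 1.4x^2, 0.3x) and returns the x values; the function names and the initial conditions are assumptions, since the initial conditions used in this work are not stated here.

```python
def logistic_series(x0, n):
    """Return n values of the logistic map x -> 4 x (1 - x), starting at x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs


def henon_x_series(x0, y0, n):
    """Return n x-values of the Henon map (x, y) -> (y + 1 - 1.4 x^2, 0.3 x)."""
    xs, x, y = [x0], x0, y0
    for _ in range(n - 1):
        # Both components are updated simultaneously from the old (x, y).
        x, y = y + 1.0 - 1.4 * x * x, 0.3 * x
        xs.append(x)
    return xs


# Hypothetical initial conditions.
logistic_data = logistic_series(0.25, 3)
henon_data = henon_x_series(0.0, 0.0, 200)
```

In practice one would usually discard an initial stretch of each series so that the remaining points lie close to the attractor.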

Once a minimum of $Q$ is reached, the net is
used to calculate a predicted time series. In the
examples presented here, we calculated 1024
points by iteration starting from the last $M$ points
of the input time series, $M = 1$ for the logistic
data, $M = 2$ for Henon. Each newly calculated
value of the variable is appended to the input
time series, and the last $M$ values of the augmented
time series are used to calculate the next
entry. Comparison of the time series generated
by the net with those calculated from the equations
of motion with the same initial conditions
showed that the net’s predictions diverged from
the values given by the equations very quickly.
As mentioned earlier, this is not a surprise.
Instead of comparing trajectories, the predicted
time series are used to reconstruct the attractors.
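
The iteration just described can be sketched as below. The trained net is represented by an arbitrary function `predict` that maps the last M values to the next one; this name, `iterate_predictions`, and the stand-in predictors are hypothetical.

```python
def iterate_predictions(series, m, predict, count):
    """Return count predicted values, each computed from the last m entries
    of the series augmented by the predictions made so far."""
    extended = list(series)
    for _ in range(count):
        extended.append(predict(extended[-m:]))
    return extended[len(series):]


# Stand-in for a trained net with M = 1: the exact logistic map.
logistic_predictions = iterate_predictions(
    [0.25], 1, lambda inputs: 4.0 * inputs[-1] * (1.0 - inputs[-1]), 3)

# Stand-in with M = 2: linear extrapolation from the last two values.
ramp_predictions = iterate_predictions(
    [1.0, 2.0], 2, lambda inputs: 2.0 * inputs[-1] - inputs[-2], 3)
```

Because each prediction is fed back as an input, errors compound from step to step, which is consistent with the rapid divergence from the true trajectory noted above.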

3.1.  Reconstructed  attractors 

We use the method of time delays [8] to
reconstruct the systems’ attractors in two-dimensional
phase spaces and visually compare attractors
constructed from the input data with those
constructed using the nets’ predictions. This
comparison is necessarily qualitative and subjective,
but is very much in the spirit of the use of
neural networks for pattern recognition. There
is, however, one important and novel difference
in that the pattern that is presented to the network
here is not merely a spatial configuration,
but dynamical information encoded in a scalar
time series that is used to reconstruct what may
be a complex geometrical object in a multidimensional
phase space. Indeed, for a chaotic
time series, one may be trying to reconstruct a
fractal. The next section gives a quantitative
comparison of the results discussed here in qualitative
terms.
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Figures  2  and  3  are  plots  of  J!:„  +  ,  vs.  jr„  for  both 
systems.  Figures  2a  and  3a  show  the  familiar 
attractors  of  the  logistic  and  the  Henon  maps 
while  figs.  2b  and  3b  show  how  well  these  attrac¬ 
tors  are  obscured  with  the  addition  of  enough 
noise  to  get  a  signal-to-noise  ratio  (SNR)  of 
10  dB.  What  the  networks  are  asked  to  accom¬ 
plish  is,  first  of  all,  to  determine  that  there  are 
patterns  embedded  in  figs.  2b  and  3b  and  sec¬ 
ondly,  that  the  pattern  hidden  in  fig.  2b  is  that 
shown  in  fig.  2a,  and  that  hidden  in  fig.  3b  is  that 
shown  in  fig.  3a. 
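For readers who wish to reproduce a data set of this kind, the following sketch adds noise at a prescribed SNR. It assumes white Gaussian noise and defines the SNR as the ratio of signal variance to noise variance; neither choice is stated in the text, and the sine wave is merely a stand-in signal.

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise so that var(signal)/var(noise) matches snr_db."""
    signal = np.asarray(signal, dtype=float)
    noise_power = np.var(signal) / 10.0 ** (snr_db / 10.0)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

rng = np.random.default_rng(0)
clean = np.sin(0.1 * np.arange(20000))  # stand-in for a clean time series
noisy = add_noise(clean, 10.0, rng)

# Check the SNR actually realised in this sample.
measured_snr_db = 10.0 * np.log10(np.var(clean) / np.var(noisy - clean))
```

At 10 dB the noise variance is one tenth of the signal variance, i.e. the noise amplitude is roughly 30% of the signal amplitude.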

Figures  4a  and  4b  are  reconstructions  of  the 
logistic  attractor  using  input  data  sets  of  only 
eight  points  (marked  by  crosses)  from  the 
“clean”  and  “noisy”  data  sets,  respectively. 
Here  “clean”  means  data  points  calculated  from 
the  equation  of  motion  while  “noisy”  means  a 
set with SNR = 10 dB. Although distorted, the similarity of the
reconstructions to fig. 2a is obvious. We note that in fig. 4b, the net
does not faithfully reproduce the behavior of the noisy input set.
Rather, it produces a smooth shape that may represent an “averaging
out” of the noise.

Fig. 2. Logistic attractor: (a) Delay plot, $x_{n+1}$ versus $x_n$, of
“clean” input data calculated from the logistic equation; (b) Delay
plot of “noisy” logistic input data (SNR = 10 dB).

Fig. 3. Henon attractor: (a) Delay plot of “clean” input data
calculated from the Henon equations; (b) Delay plot of “noisy” Henon
data.

The  attractor  of  the  logistic  map  (fig.  2a)  is 
just  a  parabola,  and  one  might  argue  that  it 
could  more  simply  be  obtained  by  a  polynomial 
fit  involving  no  more  than  three  parameters.  The 
Henon  attractor  (fig.  3a),  however,  is  quite  a 
different  matter.  Its  form  is  not  describable  by 
an expression in closed form that gives $x_{k+1}$ in
terms of $x_k$. The network needs to extract from
the input time series information equivalent to
that  embodied  in  the  two  coupled  difference 
equations  of  the  Henon  map. 
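The contrast between the two maps can be illustrated numerically. The sketch below fits a three-parameter quadratic to the delay plot of each series; the parameter values (r = 4 for the logistic map, a = 1.4 and b = 0.3 for the Henon map) and the function names are assumptions made for the illustration, since the text does not state the values used.

```python
import numpy as np

def logistic_series(n, r=4.0, x0=0.3):
    x = np.empty(n)
    x[0] = x0
    for k in range(n - 1):
        x[k + 1] = r * x[k] * (1.0 - x[k])
    return x

def henon_series(n, a=1.4, b=0.3, n_skip=100):
    """x-component of x' = 1 - a x^2 + y, y' = b x, after discarding a transient."""
    x, y = 0.0, 0.0
    out = np.empty(n)
    for k in range(n + n_skip):
        x, y = 1.0 - a * x * x + y, b * x
        if k >= n_skip:
            out[k - n_skip] = x
    return out

def quadratic_fit(x):
    """Least-squares fit x_{k+1} = c2 x_k^2 + c1 x_k + c0 and its rms residual."""
    coeffs = np.polyfit(x[:-1], x[1:], 2)
    residual = np.sqrt(np.mean((x[1:] - np.polyval(coeffs, x[:-1])) ** 2))
    return coeffs, residual

logistic_coeffs, logistic_rms = quadratic_fit(logistic_series(1000))
henon_coeffs, henon_rms = quadratic_fit(henon_series(1000))
```

The quadratic reproduces the logistic delay plot essentially exactly, whereas for the Henon series a sizeable residual remains because $x_{k+1}$ also depends on $x_{k-1}$.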

Fig. 4. Reconstructed logistic attractor: (a) using 8 points of
clean input; (b) using 8 points of noisy input.

Figure 5a is a reconstruction of the Henon
attractor using eight values of clean data. Even
with such a small input set, the net has obviously
captured  visually  recognizable  details  of  the  at¬ 
tractor  with  remarkable  fidelity.  This,  however, 
it  cannot  do  using  eight  points  of  the  noisy  data 
set.  The  first  recognizable  reconstruction  occurs 
with  a  training  set  of  16  points  (fig.  5b).  Here, 
one  has  a  distorted,  but  still  recognizable  version 
of  the  attractor.  Increasing  the  size  of  the  train¬ 
ing  set  improves  the  net’s  performance,  as  indi¬ 
cated  in  fig.  5c  which  uses  64  points.  However,  it 
does  not  seem  to  do  as  well  with  a  training  set  of 
128  points  (fig.  5d). 

3.2.  Correlation  dimension 

To  provide  a  more  quantitative  comparison  of 
the  clean,  noisy,  and  reconstructed  attractors,  we 
calculated  their  correlation  dimensions  using  the 
Grassberger-Procaccia algorithm [9]. 1004 values of $x_n$ were used to
create 1000 points in a five-dimensional embedding space:

$$y_1 = (x_1, x_2, \ldots, x_5),$$
$$y_2 = (x_2, x_3, \ldots, x_6),$$
$$\vdots$$
$$y_k = (x_k, x_{k+1}, \ldots, x_{k+4}).$$

Fig. 5. Reconstructed Henon attractor: (a) using 8 points of
clean input; (b)-(d) using 16, 64, and 128 points, respectively,
of noisy input.

Since the correlation dimensions of the attractors
are both less than 2.0, the choice of a 5-dimensional
embedding space guarantees satisfaction
of Takens’ criterion [10], which requires a space
of dimension $2d + 1$ to embed an object of
dimension $d$. Using 1000 embedding vectors
makes it possible to do the necessary calculations
relatively quickly while still satisfying the usual
estimate that one needs some $10^d$ points to resolve
a $d$-dimensional attractor. The embedding
vectors $y_k$ are used to calculate the correlation
integral

$$C(r) = \frac{1}{N_p} \sum_{i<j}^{1000} \Theta(r - |y_i - y_j|), \qquad (3.3)$$

where $\Theta$ is the Heaviside function, $|x|$ is the
Euclidean distance operator, and $N_p$ is the number
of distinct pairs of points in the embedding
space. Grassberger and Procaccia have shown
that under appropriate conditions, the function
ln C(r) versus ln r has a linear region, called the
“scaling region”. The slope of the function in
that scaling region is the correlation dimension
$D_2$.
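A direct, brute-force evaluation of the correlation integral (3.3) can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used for figs. 6 and 7: the logistic parameter r = 4, the range of ln r used for the straight-line fit and the helper names are all choices made here.

```python
import numpy as np

def delay_vectors(x, dim):
    """Rows are y_k = (x_k, x_{k+1}, ..., x_{k+dim-1})."""
    n_vec = len(x) - dim + 1
    return np.column_stack([x[i : i + n_vec] for i in range(dim)])

def correlation_integral(vectors, radii):
    """Fraction of distinct pairs whose Euclidean separation is below each radius."""
    n = len(vectors)
    diff = vectors[:, None, :] - vectors[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pair_dist = np.sort(dist[np.triu_indices(n, k=1)])  # distinct pairs i < j
    return np.searchsorted(pair_dist, radii, side="left") / pair_dist.size

# 1004 values of an (assumed) logistic series give 1000 five-dimensional vectors.
x = np.empty(1004)
x[0] = 0.3
for k in range(1003):
    x[k + 1] = 4.0 * x[k] * (1.0 - x[k])
vectors = delay_vectors(x, 5)

log_r = np.linspace(-4.0, -1.5, 11)
c = correlation_integral(vectors, np.exp(log_r))
mask = c > 0  # avoid log(0) where no pairs fall inside the radius
slope = np.polyfit(log_r[mask], np.log(c[mask]), 1)[0]  # crude dimension estimate
```

With only 1000 vectors the pair counts at small r are low, so the fitted slope is a rough estimate; inspecting the local slope d ln C(r)/d ln r, as in figs. 6 and 7, is the more careful procedure.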

Figure  6  shows  a  superposition  of  plots  of  the 
slope, d ln C(r)/d ln r, of the function ln C(r)
versus ln r for the logistic map using clean data,
noisy  data,  and  reconstructions  from  noisy  data 
using  8-  and  128-point  training  sets.  Figure  7 
shows  similar  plots  for  the  Henon  map  with  the 
difference  that  the  reconstructions  use  64-  and 
256-point  training  sets  taken  from  the  noisy  data. 

We note first of all that, as expected, the plots
for the noisy data for both attractors do not have
scaling regions, which means that if either noisy
data set is at all characterizable by a dimension,
its dimension cannot be resolved by this implementation
of the Grassberger-Procaccia algorithm.
On the other hand, plots for the clean
data and for the reconstructed attractors in fig. 6
are practically coincident in an easily identifiable
scaling region between ln r = -3 and ln r = -6.


Fig. 6. Correlation dimension of the logistic attractor: Slope of ln C(r) versus ln r plotted as a function of ln r. Noisy data:
dot-dot-dash; clean data: solid with open circles; data reconstructed from an 8-point noisy input set: dashes; data reconstructed
from a 128-point noisy input set: solid.

Fig. 7. Correlation dimension of the Henon attractor: Slope of ln C(r) versus ln r plotted as a function of ln r. Noisy data:
dot-dot-dash; clean data: solid with open circles; data reconstructed from a 64-point noisy input set: dashes; data reconstructed
from a 256-point noisy input set: solid.


All  three  calculations  yield  a  correlation  dimen¬ 
sion  of  approximately  1.0,  which  is  the  expected 
value  for  the  parameter  we  used.  Similar  obser¬ 
vations  hold  for  the  Henon  map  (fig.  7).  Calcula¬ 
tions  using  the  clean  data  and  reconstructions 
using  64-  and  256-point  training  sets  taken  from 
the  noisy  data  have  a  common  scaling  region 
between ln r = -2 and ln r = -6, and all three
yield a correlation dimension of approximately
1.2, which is well within 10% of the value obtained
by using much larger data sets (a rigorous
upper bound for the correlation dimension of the
Henon attractor, obtained by using its equations
of motion, is its Lyapunov dimension, 1.25826 ±
0.00006 [11]).
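The Lyapunov dimension quoted above can be estimated directly from the equations of motion. The sketch below is not the calculation of ref. [11]; it assumes the standard Henon parameters a = 1.4, b = 0.3, obtains the two Lyapunov exponents by repeated QR factorisation of the tangent map, and applies the Kaplan-Yorke formula $D_L = 1 + \lambda_1/|\lambda_2|$.

```python
import numpy as np

def henon_lyapunov(a=1.4, b=0.3, n_iter=20000, n_skip=1000):
    """Lyapunov exponents of x' = 1 - a x^2 + y, y' = b x via QR iteration."""
    x, y = 0.0, 0.0
    q = np.eye(2)
    sums = np.zeros(2)
    for k in range(n_iter + n_skip):
        jac = np.array([[-2.0 * a * x, 1.0], [b, 0.0]])  # Jacobian at the current point
        x, y = 1.0 - a * x * x + y, b * x
        q, r = np.linalg.qr(jac @ q)
        if k >= n_skip:
            sums += np.log(np.abs(np.diag(r)))
    return sums / n_iter

lam1, lam2 = henon_lyapunov()
lyapunov_dimension = 1.0 + lam1 / abs(lam2)
```

Because the Jacobian determinant is constant, the two exponents must sum to ln b, which provides a simple consistency check on the iteration.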

4.  Summary 

The  examples  presented  above  suggest  that  in 
trying  to  answer  the  question  posed  in  the  Intro¬ 
duction,  “is  it  just  noise  or  is  there  some  poten¬ 
tially  meaningful  information  being  masked  by 
the  noise?”,  backpropagating  neural  nets  may 
prove  to  be  useful  tools.  These  examples  show 
that  neural  nets  have  the  capability  of  extracting 


some  geometric  properties  of  the  attractors  of 
chaotic  signals  immersed  in  large  amounts  of 
noise,  and  that  they  can  do  this  extraction  using 
relatively  small  input  data  sets. 

However,  there  are  problems.  Most  notable 
among  these  is  that  the  performance  of  the 
neural  nets  is  quite  inconsistent.  The  training  of 
the  net  is  started  by  assigning  random  values  to 
the  weights  and  biases.  Different  initial  random 
weights  can  evolve  to  different  minima  of  the 
error  function,  resulting  in  different  levels  of 
performance  even  when  using  the  same  input 
data.  One  set  of  initial  weights  and  biases  might 
lead  to  a  time  series  that  reconstructs  almost  the 
entire  attractor,  another  might  reconstruct  only 
short  segments,  or  get  trapped  in  a  periodic 
motion  on  a  small  set  of  points.  This  behavior 
can  sometimes  be  alleviated  by  calculating  sever¬ 
al  sets  of  output,  each  output  starting  from  ran¬ 
domly  chosen  starting  points  in  the  input  data 
set. 

In addition, there do not exist consistent
criteria for determining the optimum architecture
of the net, or the optimum size of the
training set. In fig. 5, the performance of the net
seems to improve in that it gives a less-distorted
attractor (although with less folding) as the size
of  the  training  set  is  increased  from  16  to  64. 
Then  it  seems  to  deteriorate  when  the  size  is 
further  increased  to  128.  Calculations  with  train¬ 
ing  sets  consisting  of  256,  512  and  1024  points 
show  that  the  quality  of  the  reconstruction  again 
increases  as  one  goes  from  128  to  256,  but  then 
deteriorates  with  the  larger  training  sets. 

These  problems  are  by  no  means  unique  to  the 
application  which  we  have  presented  here,  and 
extensive  efforts  are  underway  to  alleviate  them 
[5,7]. In spite of these problems, examples such
as  those  we  have  discussed  indicate  that  neural 
nets  can  become  very  powerful  tools  for  extract¬ 
ing  signals  from  noise-corrupted  data  sets,  but 
there  is  a  need  for  the  development  of  tech¬ 
niques  that  would  make  them  perform  this  task 
more  consistently,  and  for  the  definition  of  easily 
implementable  criteria  for  optimizing  their  per¬ 
formance. 
The results of recent experimental and theoretical investigations of the spectral densities of fluctuations (SDFs) of
noise-driven nonlinear dynamical systems are reviewed. Emphasis is placed on the analysis of the shapes and intensities of
peaks in the SDFs. Three different types of phenomena are considered. First, the SDFs of a class of monostable
underdamped nonlinear systems, in which the variation of eigenfrequency with energy is nonmonotonic, are investigated.
It is shown that they exhibit zero-dispersion peaks and noise-induced spectral narrowing, as well as zero-frequency peaks.
Secondly, it is demonstrated that systems bistable in an external periodic field can exhibit supernarrow spectral peaks
within the range of a kinetic phase transition. Finally, recent results in stochastic resonance (SR) are reviewed, including
phase shifts, giant nonlinearities for weak noise, SR for periodically modulated noise intensity, and high-frequency SR for
periodic  attractors. 


1.  Introduction 

Spectral  densities  of  fluctuations  (SDFs)  pro¬ 
vide  an  important  means  of  characterising  phys¬ 
ical  systems,  because  they  can  be  measured  di¬ 
rectly  in  a  variety  of  experiments:  in  particular, 
the  optical  and  neutron  spectra  of  systems  in 
thermal  equilibrium  (or  quasiequilibrium)  -  one 
of  the  main  sources  of  information  about  the 
microscopic  characteristics  of  many  such  sys¬ 
tems  -  are  immediately  related  to  SDFs.  The 
investigation of SDFs also makes it possible to
observe  and  analyse  the  interplay  between  the 
fluctuations,  relaxation  and  nonlinearity  that  are 
inherent  to  real  macroscopic  physical  systems. 
This  interplay  provides  one  of  the  most  challeng¬ 
ing  problems  of  modern  nonlinear  physics. 

In  many  cases  of  interest,  the  physical  system 
to be investigated can be modelled by a more or
less  complicated  damped  dynamical  system  that 
is  subject  to  noise.  If  the  noise  and  the  relaxation 
are  both  due  to  coupling  to  a  thermal  bath,  then 
they  will  satisfy  the  fluctuation-dissipation  rela¬ 
tions  [1]  and  the  characteristic  intensity  of  the 
noise  will  be  equal  to  the  temperature  T  of  the 
bath.  In  the  general  case,  a  nonthermal  noise  is 
also  present.  Certain  properties  of  the  systems, 
and  of  their  SDFs  in  particular,  are  highly  sensi¬ 
tive  to  the  characteristics  of  the  noise,  while 
others  are  universal  and  depend  only  weakly  on 
these  characteristics.  Both  types  of  property  are 
clearly  of  importance  in  different  contexts.  In 
what  follows,  our  main  aim  will  be  to  consider 
phenomena  exhibited  by  noise-driven  archetypal 
models.  Similar  phenomena  may  of  course  then 
be  predicted  for  real  systems,  over  a  very  wide 
range  of  contexts  in  science  and  technology, 
whenever  they  are  described  by  equations  of  the 
same  general  form  as  those  we  will  discuss. 


0167-2789/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved

M.I. Dykman, P.V.E. McClintock / Spectra of noise-driven nonlinear systems


In  the  present  paper  we  outline  recent  results 
on  the  SDFs  of  relatively  simple,  although  non¬ 
trivial,  nonlinear  systems.  Emphasis  is  placed  on 
the  shapes  and  intensities  of  the  peaks  of  the 
SDF.  Three  sorts  of  effects  are  considered.  In 
section  2  we  analyse  the  shapes  of  the  peaks  for 
monostable  underdamped  nonlinear  systems  and 
investigate  effects  related  to  nonmonotony  of  the 
dependence of the frequency of eigenvibrations
$\omega(E)$ on the energy E of a system. Such nonmonotony
is inherent in a number of vibrational
systems.  Examples  include  the  localised  vibra¬ 
tions  in  solids,  where  nonmonotony  will  arise 
provided  that  the  “stiffness”  of  the  system  in¬ 
creases  with  energy  for  small  E  (see  [2]  for  a 
review)  and  where  it  can  be  controlled  by  exter¬ 
nal  electric  field  and/or  pressure.  For  systems  of 
this  kind,  the  widths  of  the  SDF  peaks  at  first 
increase  in  the  usual  way  with  increasing  noise 
intensity,  relative  to  their  low  noise  values 
(which  are  determined  by  damping).  Surprising¬ 
ly,  however,  they  can  sometimes  decrease  again, 
by  a  large  factor,  as  the  noise  intensity  continues 
to  rise.  Moreover,  for  very  small  damping,  a 
specific  zero-dispersion  peak  can  arise  at  the 
frequency  of  the  extremum  [3]. 

In  section  3  the  SDF  is  investigated  for  bist¬ 
able  systems,  with  the  emphasis  on  bistability 
arising  in  an  external  periodic  field  where  the 
coexisting  stable  states  correspond  to  forced 
periodic  vibrations  with  different  amplitudes  and 
phases.  Bistable  systems  driven  by  a  sufficiently 
weak  noise  have  a  very  large  characteristic  relax¬ 
ation  time  that  is  given  by  the  reciprocal  prob¬ 
abilities  of  fluctuational  transitions  between  the 
stable  states.  Associated  with  this  time  is  an 
extremely  small  spectral  width  of  the  peaks  of 
the SDF (supernarrow peaks) that arise at the
frequency  of  the  driving  field  and  its  overtones, 
and  also  at  zero  frequency.  The  peaks  exhibit  a 
critical-type  behaviour  for  the  parameters  of  the 
system  lying  in  the  vicinity  of  a  kinetic  “phase 
transition”  where  the  stationary  populations  of 
the  coexisting  stable  states  are  of  the  same  order 
of  magnitude. 


In  section  4  a  phenomenon  directly  related  to 
the  aforementioned  super-narrow  peaks  is  inves¬ 
tigated,  namely,  the  onset  of  a  strong  response 
of a bistable noise-driven system to a comparatively
weak (trial) periodic field [4] and the
dome-like  (bell-shaped)  dependence  of  the  re¬ 
sponse  on  the  noise  intensity  called  stochastic 
resonance  by  Benzi  et  al.  [5]  (see  also  [6]).  This 
phenomenon  has  attracted  considerable  interest 
recently  and  has  been  observed  in  both  active  [7] 
and  passive  [8]  optically  bistable  systems  and 
also  in  analogue  electronic  experiments  [9-12]. 

2.  Noise-induced  narrowing  of  the  spectral 
peaks  of  monostable  underdamped  systems 

In  view  of  its  importance  and  a  wide  variety  of 
applications,  the  problem  of  the  power  spectra  of 
nonlinear  vibrational  systems  has  been  consid¬ 
ered  by  many  authors,  both  numerically  and 
analytically  (see  refs.  [13-24]  and  the  reviews 
[4b,  25]).  Underdamped  systems,  in  particular, 
are  of  the  utmost  interest,  because  of  their  as¬ 
sociation  with  resonant  phenomena,  including, 
e.g.  resonant  light  absorption  and  neutron  scat¬ 
tering  in  condensed  matter  that  is  directly  de¬ 
scribed  just  by  the  SDFs.  It  is  generally  accepted 
that  the  peaks  of  the  SDF  usually  become  sub¬ 
stantially  broader  as  the  external  noise  intensity 
increases.  This  is  due  to  the  growth  of  fluctua¬ 
tions  in  the  system.  However,  as  is  shown  below, 
in  some  systems  the  broadening  is  followed, 
remarkably,  by  a  narrowing  of  the  peaks  with 
further  increase  of  the  noise  intensity. 

We shall investigate the evolution of the peaks for
the simplest model of a fluctuating nonlinear
system, a nonlinear oscillator performing Brownian
motion. It is described by the equation

$$\ddot q + 2\Gamma \dot q + U'(q) = f(t), \qquad \langle f(t) f(t') \rangle = 4\Gamma T \delta(t - t'). \qquad (1)$$

If  its  fluctuations  correspond  to  thermal  equilib¬ 
rium,  then  T  in  (1)  is  the  temperature,  whereas 
in the more general case it simply characterises
the intensity of the driving noise which, in the
present section, is supposed to be white and
Gaussian. The damping $\Gamma$ is assumed small,

$$\Gamma \ll \omega_0, \qquad \omega_0 \equiv \omega(0) = [U''(q_{eq})]^{1/2}, \qquad (2)$$

where $\omega(E)$ is the eigenfrequency of conservative
vibrations with a given energy E,

$$E = \tfrac{1}{2}\dot q^2 + U(q) \qquad (3)$$

(the energy is measured from the value of the
potential U(q) in the equilibrium position $q_{eq}$:
$U(q_{eq}) = U'(q_{eq}) = 0$).
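Equation (1) is easy to integrate numerically, which gives a quick check on the normalisation of the noise. The sketch below is an illustration only (the results reviewed here come from analogue electronic experiments): it uses a semi-implicit Euler-Maruyama step, an assumed single-well Duffing force $-U'(q) = -(q + q^3)$, and arbitrarily chosen values of $\Gamma$, T and the time step.

```python
import numpy as np

def simulate_oscillator(gamma, temp, dt, n_steps, n_osc, rng, force):
    """Integrate q'' + 2*gamma*q' + U'(q) = f(t), <f f'> = 4*gamma*temp*delta,
    for an ensemble of n_osc independent oscillators."""
    q = np.zeros(n_osc)
    p = np.zeros(n_osc)
    noise_amp = np.sqrt(4.0 * gamma * temp * dt)  # std of the noise impulse per step
    for _ in range(n_steps):
        p += (-2.0 * gamma * p + force(q)) * dt + noise_amp * rng.standard_normal(n_osc)
        q += p * dt  # uses the updated velocity (semi-implicit step)
    return q, p

def duffing_force(q):
    return -(q + q ** 3)  # assumed potential U = q^2/2 + q^4/4

rng = np.random.default_rng(1)
q_end, p_end = simulate_oscillator(0.1, 0.5, 0.01, 8000, 4000, rng, duffing_force)
mean_p2 = np.mean(p_end ** 2)  # should approach T by equipartition
```

In the stationary state the mean square velocity should be close to T, which confirms that the factor $4\Gamma T$ in (1) corresponds to a temperature T.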

In  what  follows  (see,  however,  section  4)  we 
shall  consider  the  SDF  of  the  coordinate  defined 
as 


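$$Q(\omega) = \lim_{\tau_0 \to \infty} (4\pi\tau_0)^{-1} \left| \int_{-\tau_0}^{\tau_0} dt\, [q(t) - \langle q(t) \rangle] \exp(i\omega t) \right|^2 . \qquad (4)$$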

Here, $\langle \ldots \rangle$ implies the ensemble average
(which is well known [1,4b] to differ from the
time average for the periodically driven systems
considered below; for such systems the time axis
is evidently “inhomogeneous”).

2.1. Peak of the SDF for “small” noise
intensities

For very weak noise (small T) the oscillator
(1) can be assumed effectively harmonic, with an
eigenfrequency $\omega_0$ and damping parameter $\Gamma$.
The SDF $Q(\omega)$ for such an oscillator is well
known (cf. [1]) to have a Lorentzian peak at the
frequency $\omega_0$ with a halfwidth at halfmaximum
just equal to $\Gamma$. With increasing noise intensity
the shape of the peak changes, and this change
can be strong even for relatively small noise
intensities (which was probably noticed for the
first time in ref. [26] where the quantum theory
of the spectra of localised vibrations was considered).

The origin of the strong noise-induced
broadening of the spectral peak can easily be
understood from fig. 1. Because of fluctuations,
a distribution of the oscillator is formed over the
energy E. Its characteristic width is given by the
driving-noise intensity T. In its turn, because of
nonlinearity, this distribution gives rise to a distribution
of the oscillator over the corresponding
range of vibrational eigenfrequencies $\omega(E)$, i.e.,
there arises a noise-induced frequency straggling
$\delta\omega_{fl}$, which for small noise intensities is equal to

$$\delta\omega_{fl} = T|\omega_0'|, \qquad \omega_0' \equiv [d\omega(E)/dE]_{E=0}, \qquad \delta\omega_{fl} \ll \omega_0. \qquad (5)$$

The frequency straggling (5) “competes” with
the frequency “uncertainty” $\Gamma$ arising from
damping. The shape of the peak in the SDF
depends just on the ratio of these two quantities:

$$\alpha = \tfrac{1}{2} (\delta\omega_{fl}/\Gamma)\, \mathrm{sgn}\, \omega_0' .$$

Fig. 1. Variation of eigenfrequency $\omega(E)$ with the energy E
for a general nonlinear oscillator. If the oscillator is driven by
noise of intensity T, its energy will be described by a
distribution whose width is approximately equal to T, so that
the frequencies of the thermally excited vibrations are mostly
those on the thickened portion of the curve.

For arbitrary $\alpha$, but for both weak damping,
$\Gamma \ll \omega_0$, and “weak” noise, $\delta\omega_{fl} \ll \omega_0$, the peak is
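described by a comparatively simple expression
[15]

$$Q(\omega) = \frac{T}{2\pi\omega_0^2}\, \mathrm{Re} \int_0^\infty dt\, \exp[i(\omega - \omega_0)t]\, \tilde{Q}^*(t), \qquad |\omega - \omega_0| \ll \omega_0,$$

$$\tilde{Q}(t) = \exp(\Gamma t) \left[ \cosh(at) + \frac{\Gamma}{a}(1 - 2i\alpha) \sinh(at) \right]^{-2}, \qquad a = \Gamma (1 - 4i\alpha)^{1/2}. \qquad (6)$$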
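Expression (6) can be evaluated by straightforward quadrature. The sketch below is an illustration and not the procedure of ref. [15]: it works in units with $\Gamma = 1$, omits the prefactor $T/2\pi\omega_0^2$, truncates the time integral at $t = 30/\Gamma$, and takes the complex conjugate of $\tilde{Q}(t)$ in the integrand, which is the reading of (6) for which the peak lies on the high-frequency side of $\omega_0$ when $\alpha > 0$.

```python
import numpy as np

def peak_shape(detuning, gamma, alpha, t_max_factor=30.0, n_t=6001):
    """Re int_0^inf dt exp(i*detuning*t) * conj(Q~(t)), with detuning = omega - omega_0."""
    t = np.linspace(0.0, t_max_factor / gamma, n_t)
    a = gamma * np.sqrt(1.0 - 4.0j * alpha)
    q_tilde = np.exp(gamma * t) / (
        np.cosh(a * t) + (gamma / a) * (1.0 - 2.0j * alpha) * np.sinh(a * t)
    ) ** 2
    weights = np.full(n_t, t[1] - t[0])  # trapezoidal rule weights
    weights[0] *= 0.5
    weights[-1] *= 0.5
    phase = np.exp(1j * np.outer(detuning, t))
    return (phase * np.conj(q_tilde) * weights).sum(axis=1).real

# alpha = 0 should reduce to a Lorentzian gamma / (gamma^2 + detuning^2).
lorentz_centre = peak_shape(np.array([0.0]), 1.0, 0.0)[0]

# Strong fluctuational broadening: asymmetric peak shifted towards positive detuning.
detuning = np.linspace(-10.0, 30.0, 201)
shape = peak_shape(detuning, 1.0, 5.0)
peak_position = detuning[np.argmax(shape)]
```

For $|\alpha| \gg 1$ the maximum is expected near a detuning of $2\alpha\Gamma = T\omega_0'$, which the numerical curve can be compared against.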


It follows from (6) that for $|\alpha| \gg 1$, i.e. for
$\delta\omega_{fl} \gg \Gamma$, it is fluctuational broadening that
determines the shape of the peak in the SDF
near its maximum. It also follows that, in contrast
to the case $|\alpha| \ll 1$, where
$Q(\omega) \propto T\Gamma/[\Gamma^2 + (\omega - \omega_0)^2]$ is symmetrical near the maximum,
for $|\alpha| \gg 1$ the peak is strongly asymmetric.
The shape of the peak in the latter case can
readily be understood by noting that the amplitude
of the eigenvibrations increases with increasing
energy (as $E^{1/2}$ for small E), while the
probability of the system having an energy E
decreases exponentially, according to the Gibbs
law. The product of the squared amplitude times
$\exp(-E/T)$ is “mapped” onto the spectral distribution
$Q(\omega)$ via the relation $\omega = \omega(E) \approx \omega_0 + \omega_0' E$,
so that $Q(\omega)$ near the maximum is proportional
to $[(\omega - \omega_0)/\omega_0'] \exp[-(\omega - \omega_0)/\omega_0' T]$.
The position of the maximum itself is given by
$\omega_0 + T\omega_0' = \omega(T)$ and the peak increases rapidly
in width with the noise intensity, being much
steeper on the side of the sharp low-energy
threshold (cf. fig. 1).

The above picture has been completely confirmed
by analogue electronic experiments
[24, 27]. The evolution of the SDF with increasing
noise intensity for an oscillator (1) with the
potential

$$U(q) = \tfrac{1}{2} q^2 + \tfrac{1}{4} q^4 + A q \qquad (7)$$

at A = 0, when the eigenfrequency $\omega(E)$ increases
monotonically with E, as observed in
[27], is shown in fig. 2. The stronger the noise
the broader the peak, and its width for the
values of T in fig. 2 substantially exceeds the
relaxational broadening $\Gamma$.

Fig. 2. Spectral density $Q(\omega)$ of the fluctuations of the
oscillator described by (1) and (7) for damping $\Gamma$ = 0.0143
and the asymmetry parameter A = 0, as measured (histograms)
in an analogue experiment [27] for comparison with
theoretical predictions (curves), for noise intensities: (a) T =
0.078; (b) 0.687; (c) 3.04.

Fig. 3. Temperature dependence of the far-infrared absorption
in an NaI crystal doped with 0.4% NaCl [28]. Axes: photon
energy (meV) and frequency (cm$^{-1}$).

Strikingly similar behaviour has been observed
[28] in the optical absorption spectra of localised
and resonant vibrations in solids as shown, for
example, by the results of fig. 3. Just as in the
case of the Duffing oscillator SDFs of fig. 2, the
absorption  spectrum  of  the  resonant  mode  in  fig. 
3  broadens  rapidly  and  becomes  noticeably 
asymmetric  with  increasing  temperature.  (The 
dependences  of  the  intensity  on  temperature  in 
the  two  graphs  differ  because  the  optical  absorp¬ 
tion  cross-section  in  the  experiment  of  fig.  3 
varies  approximately  as  the  SDF  divided  by  tem¬ 
perature.)  We  note  that,  in  many  physical  sys¬ 
tems,  effects  arising  from  quantum  statistics  (i.e. 
related  to  the  discreteness  of  the  energy  levels) 
of  the  localised  modes  are  important:  such  ef¬ 
fects  are  beyond  the  scope  of  the  present  review. 

2.2.  Noise-induced  narrowing  and  onset  of  the 
zero-dispersion  peak 

A peculiar situation of particular interest arises
when the dependence of the eigenfrequency
$\omega(E)$ on the vibration energy E is nonmonotonic
and for some energy $E_m$ the derivative $\omega'(E)$
passes through zero,

$$[d\omega(E)/dE]_{E=E_m} = 0, \qquad \omega_m \equiv \omega(E_m) \qquad (8)$$

(cf. fig. 4; for convenience in understanding the
experimental data in figs. 2, 5 we have chosen in
fig. 4 an initial slope $\omega_0' = [d\omega(E)/dE]_{E=0}$ that is
opposite in sign to that in fig. 1, but which
corresponds to the particular system considered
below). If (8) is fulfilled there are two “cutoff”
frequencies, $\omega_0$ and $\omega_m$. For small noise intensities,
$T \ll E_m$, when the vibrations with the
eigenfrequencies close to $\omega_m$ do not come into
play, the behaviour of $Q(\omega)$ with increasing T is
described by the results of the preceding subsection.
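The curve $\omega(E)$ and its extremum can be computed for a given potential from the action $I(E) = (2\pi)^{-1} \oint \dot q\, dq$, using $\omega = dE/dI$. The sketch below assumes that potential (7) has the tilted Duffing form $U(q) = q^2/2 + q^4/4 + Aq$; the quadrature scheme, the energy grid and the function name are choices made for this illustration.

```python
import numpy as np

def eigenfrequency_curve(asym, energies, n_theta=2001):
    """omega(E) = 1 / (dI/dE) for U(q) = q^2/2 + q^4/4 + asym*q (assumed form of (7))."""
    def potential(q):
        return 0.5 * q ** 2 + 0.25 * q ** 4 + asym * q

    # Equilibrium position: the real root of U'(q) = q^3 + q + asym = 0.
    roots = np.roots([1.0, 0.0, 1.0, asym])
    q_eq = roots[np.abs(roots.imag) < 1e-9].real[0]
    u_eq = potential(q_eq)

    theta = np.linspace(-0.5 * np.pi, 0.5 * np.pi, n_theta)
    d_theta = theta[1] - theta[0]
    actions = []
    for energy in energies:
        # Turning points: the two real roots of U(q) - U(q_eq) = E.
        r = np.roots([0.25, 0.0, 0.5, asym, -(u_eq + energy)])
        turning = np.sort(r[np.abs(r.imag) < 1e-9].real)
        centre = 0.5 * (turning[0] + turning[-1])
        half = 0.5 * (turning[-1] - turning[0])
        q = centre + half * np.sin(theta)  # substitution removes endpoint singularities
        kinetic = np.clip(energy - (potential(q) - u_eq), 0.0, None)
        integrand = np.sqrt(2.0 * kinetic) * half * np.cos(theta)
        actions.append(integrand.sum() * d_theta / np.pi)  # integrand vanishes at both ends
    return 1.0 / np.gradient(np.array(actions), energies)

energies = np.linspace(0.01, 30.0, 600)
omega_tilted = eigenfrequency_curve(2.0, energies)  # A = 2
omega_sym = eigenfrequency_curve(0.0, energies)     # A = 0
i_min = int(np.argmin(omega_tilted))
e_m, omega_m = energies[i_min], omega_tilted[i_min]
```

For A = 0 the computed $\omega(E)$ rises monotonically from $\omega_0 = 1$, while for A = 2 it starts from $\omega_0 = 2$, decreases, and passes through a minimum at an interior energy, which is the situation described by (8).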

However, for T approaching $E_m$ and the position
of the maximum of $Q(\omega)$ approaching $\omega_m$,
respectively, the “flattening” of $\omega(E)$ becomes
more and more marked. In essence, as is obvious
from the above arguments, the peak of $Q(\omega)$ is
“pressed” against the frequency $\omega_m$: vibrations
with higher and higher amplitudes are being
excited, and their eigenfrequencies approach $\omega_m$.


Fig. 4. Variation of eigenfrequency $\omega(E)$ with energy E for
the particular oscillator described by (1), (7), with A = 2. It is
the existence of an extremum in $\omega(E)$ that is responsible for
the noise-induced spectral narrowing and zero-dispersion
spectral peaks discussed in the text.

But there are no eigenfrequencies beyond this
cutoff. As a result the peak becomes narrower
with increasing T and also becomes steeper on
the $\omega_m$-side, i.e. the exact opposite of the situation
for small T.

Spectral  narrowing  was  first  observed  in  an 
analogue  experiment  and  then  described  in  detail 
theoretically  [27].  The  theory  reduced  the  prob¬ 
lem  of  calculating  the  peak  of  the  SDF  to  a 
boundary-value  problem  for  an  ordinary  dif¬ 
ferential  equation  related  to  the  Fokker-Planck 
equation  for  a  noise-driven  oscillator:  the  former 
equation  was  a  Fourier-transformed  (over  time) 
equation  for  diffusion  in  energy,  but,  in  contrast 
to Kramers’ paper [29], it was the equation not
for the phase-independent, but for the phase-dependent
(as $\exp(in\phi)$, with $|n| = 1$ in the present
case) part of the distribution function.

The  experimental  and  theoretical  results  for 
the  model  (1),  (7),  demonstrating  the  noise- 
induced  narrowing  of  the  spectral  peak,  are 
shown in fig. 5. We would note that the model
(7) is extremely simple in that it contains only
one control parameter A which might be associated,
e.g., with an electric field for an oscillating
charged particle, or a static pressure. For A = 0
the eigenfrequency $\omega(E)$ increases monotonically
with E and there is no spectral narrowing (cf.
fig. 2). The nonmonotony of $\omega(E)$ arises for
$|A| > 8/7^{3/2} \approx 0.43$, and starting with slightly
higher |A| (because of finite damping; the data
refer to $\Gamma \approx 0.015$) the nonmonotony of the
peak width vs T was observed. The theory is
evidently in excellent agreement with the experiment,
and we would stress that it does not
contain any adjustable parameter.

Fig. 5. Spectral densities $Q(\omega)$ of the fluctuations of the
oscillator described by (1), (7) for $\Gamma$ = 0.0143 and the asymmetry
parameter A = 2, as measured (histograms) in an analogue
electronic experiment [27] for comparison with
theoretical predictions (curves) for noise intensities: (a) T =
0.078; (b) 0.687; (c) 3.04. Note the narrowing of the width at
the half-height as the noise intensity is increased between (b)
and (c).

A very interesting phenomenon arises in systems
with nonmonotonic $\omega(E)$ for still smaller
damping $\Gamma/\omega_m$ [3]: the onset of an additional
narrow peak in the SDF at the extreme frequency
$\omega_m$ for sufficiently high noise intensities.
Qualitatively, such a zero-dispersion peak arises
because the system spends a relatively long time
oscillating at frequencies close to $\omega_m$: for $E \approx E_m$
fluctuations over energy have little effect on the
frequency or phase of the eigenvibrations. The
characteristic width $\delta\omega_{zd}$ of the peak can be
readily obtained by noting that $\delta\omega_{zd}$ is due to the
frequency diffusion over the time $\delta t \sim (\delta\omega_{zd})^{-1}$;
in its turn, the frequency diffusion is due to
energy diffusion over the time $\delta t$:
$\delta E \sim (4\Gamma T I_m \omega_m \delta t)^{1/2}$ (cf. [29]), where $I_m$ is the action
for the vibrations with the energy $E_m$. Therefore,

$$\delta\omega_{zd} = (2\Gamma |\omega_m''| T I_m \omega_m)^{1/2}, \qquad \omega_m'' \equiv [d^2\omega(E)/dE^2]_{E_m}, \qquad I_m = \int_0^{E_m} \omega^{-1}(E)\, dE . \qquad (9)$$
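The estimate for the width of the zero-dispersion peak can be written out in three short steps. The following is an order-of-magnitude sketch which assumes that $\omega(E)$ may be expanded to second order about its extremum, where the first derivative vanishes, so that the leading frequency shift is quadratic in the energy deviation.

```latex
\delta\omega \sim \tfrac{1}{2}\,|\omega_m''|\,(\delta E)^2 ,
\qquad
(\delta E)^2 \sim 4\Gamma T I_m \omega_m\,\delta t ,
\qquad
\delta t \sim (\delta\omega_{zd})^{-1}
\\[4pt]
\Longrightarrow\quad
\delta\omega_{zd} \sim \frac{2\Gamma\,|\omega_m''|\,T I_m \omega_m}{\delta\omega_{zd}}
\quad\Longrightarrow\quad
\delta\omega_{zd}^{2} \sim 2\Gamma\,|\omega_m''|\,T I_m \omega_m .
```

The width therefore scales as the square root of the product $\Gamma T$, which is also what dimensional analysis of the quantities involved requires.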

Fig. 6. Spectral densities $Q(\omega)$ of the fluctuations of an
electronic model [30] of the oscillator described by (1), (7)
for very small damping $2\Gamma = 1.70 \times 10^{-3}$ and the asymmetry
parameter A = 2, for several noise intensities: (a) T = 0.100;
(b) 0.203; (c) 0.320; (d) 0.409; (e) 0.485; (f) 0.742. The
zero-dispersion peak is the sharp “spike” that first appears in
(d); it rapidly grows, overwhelming the usual spectral peak as
T increases, in (f).

We note that the change in frequency $\omega(E)$ over
a time $(\delta\omega_{zd})^{-1}$ due to the drift in energy is
small compared with $\delta\omega_{zd}$ and can thus be
neglected, i.e. the broadening of the peak is
purely diffusional. The shape of the zero-dispersion
peak is described by the expression [3]

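$$\delta Q_{zd}(\omega) = \mathrm{const} \times \exp(-E_m/T) \times S[(\omega - \omega_m)\, \mathrm{sgn}(\omega_m'')/\delta\omega_{zd}],$$

$$S(x) = \mathrm{Re} \int_0^\infty dt\, \exp(-ixt)\, [(1 - i) \sinh((1 - i)t)]^{-1/2}. \qquad (10)$$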

Analogue  simulations  of  the  model  (1),  (7)  have 
made  it  possible  to  reveal  the  zero-dispersion 
peak [30]. The evolution of the SDF with increasing
temperature for very small damping,
$\Gamma = 8.5 \times 10^{-4}$, is shown in fig. 6. It is obvious
from this figure that the zero-dispersion peak
emerges very suddenly with increasing temperature,
and then grows rapidly to dominate the
spectrum. The sharp “outburst” of the peak (due
to the competition of the exponentially small
occupation of the energies $E \approx E_m$ for small T
and the sharpness of the peak itself) has recently
been  described  analytically  [31],  and  the  theory 
has  been  demonstrated  [30]  to  be  in  good  agree¬ 
ment  with  the  experiment. 

2.3. Zero-frequency peaks in SDFs of monostable systems

A well-known feature of nonlinear vibrations is that they are not strictly sinusoidal: in addition to the fundamental frequency $\omega(E)$ there also exist overtones $n\omega(E)$ ($n = 2, 3, \ldots$) in their Fourier spectrum. It is to be expected, therefore, that in addition to the peak in the SDF corresponding to the main tone (see above) there will also be peaks corresponding to the overtones. Peaks of this sort have indeed been observed, e.g., in the absorption spectra of localized vibrations in solids [32] (see [2] for a review). Their width increases with the number $n$ of the overtone (cf. [33]) and exceeds that for the main tone.


For an underdamped oscillator fluctuating in an asymmetric potential well there arises, in addition, a well-resolved comparatively narrow peak in the SDF at zero frequency [24, 34] (we note that for overdamped oscillators the peak at zero frequency is the only one in the spectrum). The quantum theory of a corresponding peak in the absorption spectra of weakly nonlinear localized vibrations was given in ref. [35].

The zero-frequency peak in the SDF of the coordinate $q$ is related to the fact that, in asymmetric potential wells, the fluctuations of the oscillator energy $E$ give rise to fluctuations of the centre of the vibrations with a given energy, $q_0(E)$. These fluctuations are "slow", with a characteristic time scale equal to the relaxation time $\Gamma^{-1}$. They are purely relaxational and are not associated with any finite frequency, and thus the corresponding SDF peak should be positioned at zero frequency and have a half-width of order $\Gamma$. A simple theory shows that, for small noise intensities, the shape of the zero-frequency peak is given by the expression [34]


^  .  X  1  ,2.r.2  2r 

Qo(*')  ~  ^0  T  Ap2,2 

TT  4i  -r  a> 


4(a>>„)r-\ 
4r-  +  a>-  ' 


$$q_0' \equiv [dq_0(E)/dE]_{E=0}, \quad q_0'' \equiv [d^2 q_0(E)/dE^2]_{E=0}. \qquad (11)$$


An important feature of the zero-frequency peak is that it is not affected by the straggling of the frequencies of eigenvibrations induced by the combined effects of noise and nonlinearity (see above). Therefore it does not broaden rapidly with increasing noise intensity. It is because of this that the zero-frequency peak in the SDF is resolved much better than the peaks at the overtones: peaks of both types are due to nonlinearity of the vibrations, and therefore their intensities increase with noise strength, but the width of the zero-frequency peak becomes much smaller for noise strengths beyond $T \sim \Gamma/|\omega_0'|$ and, correspondingly, it is much higher. In addition, for relatively small noise, the intensity of the peak at the second overtone (the "main" overtone for weak noise) contains an extra numerical factor 5 [33] compared to that of the zero-frequency peak. An overall view of the SDF for the oscillator (1), (7) as obtained for the relevant electronic model and described theoretically, with a clearly visible zero-frequency peak, is shown in fig. 7. The insert demonstrates that the broadening of this peak with increasing noise is indeed small and that sometimes, rather than broadening, noise-induced narrowing may occur; this follows from eq. (11).

In concluding this section, we note that the shape of the fundamental peak for not very weak noise, when the main broadening mechanism is the fluctuational one, reflects the stationary distribution of the system over its energy (the peak gives the "projection" of this distribution on the distribution over the frequencies $\omega(E)$). Therefore it is quite sensitive to the characteristics of the driving noise, whereas the shape of the zero-frequency peak is much less sensitive to these characteristics.

Fig. 7. Spectral density $Q(\omega)$ of the fluctuations of the oscillator described by (1), (7) for damping $\Gamma = 0.0143$ and with the asymmetry parameter $A = 2.0$, for a noise intensity $T = 0.814$. The full spectrum (except for the overtones) is plotted, with the zero-frequency peak on the left-hand side and the peak corresponding to eigenvibrations at the fundamental frequency on the right-hand side. The histogram represents data from an electronic model, and the full curve represents the theory [34]. Inset: the variation of the width (defined as the half-width at half-maximum) of the zero-frequency peak, as a function of noise intensity $T$ for three values of the asymmetry parameter: (a) $A = 0.2$; (b) 0.43; (c) 2.0. The data points represent measurements on the electronic model, and the full curves represent the theory.


3.  Super-narrow  spectral  peaks  in  the  SDFs  of 
bistable  systems 

Many physical systems of particular interest have not one, but two or more coexisting attractors. These may be potential minima for a diffusing particle (e.g., for an impurity in a solid, or a reorientating molecule) or coexisting regimes of laser generation, passive optical transmission, or forced oscillations of an electron in a Penning trap [36], etc. A quite general feature of fluctuations in bistable (or multistable) systems is that, in addition to the relaxation time (times) $t_r$ characterising the dynamics in close vicinity to one of the attractors, the fluctuations are also characterised by much larger times associated with the noise-induced transitions between the attractors. These are equal to the reciprocal transition probabilities $W_{ij}^{-1}$ ($i$, $j$ enumerate the attractors, $i, j = 1, 2$). For a broad class of systems driven by Gaussian noise the dependence of $W_{ij}$ on the characteristic noise intensity $D$ is of the activation type (see [4, 37-41] and references therein),

$$W_{ij} = \mathrm{const.} \times \exp(-R_i/D). \qquad (12)$$

Here, $R_i$ can be associated with the activation energy of the transition from the state $i$ (in Kramers' model [29] of the activation of a Brownian particle over a potential barrier, $R_i$ is the height of the barrier and $D$ is the temperature). It is obvious from (12) that for sufficiently weak noise

$$W_{ij} \ll t_r^{-1} \quad (R_i \gg D). \qquad (13)$$

It is the inequality (13) that makes the concept of transition probabilities sensible.
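The separation of time scales expressed by (12) and (13) can be illustrated numerically. The following is a minimal sketch, assuming the activation law has the form $W = \mathrm{const.} \times \exp(-R/D)$; the prefactor, the activation energy, the relaxation time and the threshold `margin` are hypothetical values chosen only for illustration.

```python
import math


def transition_rate(prefactor, activation_energy, noise_intensity):
    """Activation-type rate W = prefactor * exp(-R / D), the form of eq. (12)."""
    return prefactor * math.exp(-activation_energy / noise_intensity)


def rates_are_slow(rate, relaxation_time, margin=1e-2):
    """True when rate * relaxation_time < margin, i.e. inequality (13) holds.

    The numerical margin is an arbitrary illustrative choice.
    """
    return rate * relaxation_time < margin


if __name__ == "__main__":
    # Hypothetical parameters: R = 1, relaxation time 1, prefactor 1.
    R, t_relax, prefactor = 1.0, 1.0, 1.0
    for D in (0.05, 0.1, 0.2, 0.5):
        W = transition_rate(prefactor, R, D)
        print(f"D = {D:4.2f}   W = {W:.3e}   slow = {rates_are_slow(W, t_relax)}")
```

Lowering the noise intensity by a factor of two squares the exponential factor, which is why the transition times rapidly become much longer than any intrawell relaxation time.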

Fluctuational transitions give rise to fluctuations of the instantaneous populations $w_1(t)$, $w_2(t)$ of coexisting attractors. The kinetics of the populations is described by the balance equation,

$$\dot{w}_1(t) = -W_{12}\, w_1(t) + W_{21}\, w_2(t). \qquad (14)$$

The interwell fluctuations become pronounced in the range of parameters where the stationary values of the populations, $w_1$ and $w_2$, are of the same order of magnitude (obviously, because otherwise a system spends practically all its time near one of the attractors). This parameter range is quite narrow for weak noise, since according to (14) the ratio of the stationary populations,

$$w_1/w_2 = W_{21}/W_{12} = \mathrm{const.} \times \exp[(R_1 - R_2)/D], \qquad (15)$$

is either exponentially large or small: for most parameter values, $|R_1 - R_2| \gg D$ at small $D$ (cf. (13)). The region where $R_1 \approx R_2$ can reasonably be called the range of a kinetic phase transition, by analogy with first-order phase transitions in thermal equilibrium systems where the populations of the phases (e.g. molar volumes, for a liquid-vapour transition) are of the same order of magnitude.

The fluctuations of the populations cause large (of the order of the distance between the attractors) fluctuations of the coordinate, momentum, amplitude of forced vibrations, etc. It would be expected therefore that, in the region of a kinetic phase transition, there will arise very intense and very narrow (with a width of the order of the transition probability) peaks in the SDFs of bistable systems [42] (similar peaks in susceptibilities were considered in [4]; cf. also [43]). In the case of bistability displayed in a periodic field with frequency $\omega_F$, such supernarrow fluctuational-transition-induced peaks are positioned at $\omega_F$ and its overtones $n\omega_F$, including $n = 0$; those in the SDF of the coordinate of the bistable system are described by the expression [42]

$$Q_{\mathrm{tr}}^{(n)}(\omega) = \frac{1}{\pi}\, \frac{w_1 w_2\, |q_1(n) - q_2(n)|^2\, W}{W^2 + (\omega - n\omega_F)^2}, \quad W = W_{12} + W_{21}. \qquad (16)$$

Here, $q_j(n)$ is the value of the $n$th Fourier component of the coordinate for the attractor $j$: because of the periodicity of the forced vibrations, the coordinate $q(t)$ for the $j$th attractor can be expanded as

$$[q(t)]_j = \sum_{n=-\infty}^{\infty} q_j(n) \exp(in\omega_F t) \qquad (16a)$$

(in practice, for finite noise intensities, $q_j(n)$ differ slightly from their zero-noise values; this difference is neglected in what follows). We note that, for the particular case of an overdamped system performing Brownian motion in a static bistable potential, an expression of the type (16) (with $n = 0$) was given in [44]; the supernarrow zero-frequency peak was considered also in [24, 25, 45].
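The two-state picture behind (14)-(16) is easy to evaluate numerically. The sketch below is illustrative only: it assumes that the balance equation has the usual two-state form $\dot{w}_1 = -W_{12} w_1 + W_{21} w_2$ with $w_1 + w_2 = 1$, and that (16) is a Lorentzian of half-width $W = W_{12} + W_{21}$ centred at $n\omega_F$. The function names and the numerical values are hypothetical.

```python
import math


def stationary_populations(W12, W21):
    """Stationary solution of the two-state balance equation with w1 + w2 = 1."""
    W = W12 + W21
    return W21 / W, W12 / W


def transition_peak(omega, w1, w2, q1n, q2n, W, n=0, omega_F=0.0):
    """Lorentzian transition-induced peak of half-width W centred at n * omega_F."""
    weight = w1 * w2 * abs(q1n - q2n) ** 2
    return weight * W / (math.pi * (W ** 2 + (omega - n * omega_F) ** 2))


if __name__ == "__main__":
    # Hypothetical rates: attractor 1 is three times more populated than attractor 2.
    W12, W21 = 1e-3, 3e-3
    w1, w2 = stationary_populations(W12, W21)
    W = W12 + W21
    print("populations:", w1, w2, " ratio w1/w2 =", w1 / w2)
    print("peak height:", transition_peak(0.0, w1, w2, -1.0, 1.0, W))
    print("at omega = W:", transition_peak(W, w1, w2, -1.0, 1.0, W))
```

Because the peak height scales as $w_1 w_2/W$ while the area scales as $w_1 w_2$, the peak is both very tall and very narrow when $W$ is exponentially small, and it disappears once one of the populations becomes negligible.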

A supernarrow peak at the frequency of a driving periodic field was observed and the variation of its intensity with the parameters of the system was investigated in [46]. The system analysed was an analogue electronic model of an underdamped single-well Duffing oscillator described by (1), (7) with $A = 0$, and the driving field $F\cos(\omega_F t)$ was nearly resonant, $|\omega_F - \omega_0| \ll \omega_F$. This system is closely related in particular to the case of a relativistic electron in a Penning trap: the motion of such an electron displays bistability in a sufficiently strong field with a frequency close to the cyclotron frequency [36]. The sharp onset of the supernarrow peak with variation of the dimensionless field intensity $\beta$,

$$\beta = \frac{3F^2}{32\,\omega_F^3 (\omega_F - \omega_0)^3}, \qquad (17)$$

is shown in fig. 8. The width of the peak could not be resolved. The critical dependence of the intensity of the peak on the distance (in parameter space) to the phase-transition point is clearly evident in fig. 9. The full curves correspond to the expression

$$\ln(w_1 w_2) = -|R_1' - R_2'|\, |\beta - \beta_c|/D, \qquad (18)$$

which gives the logarithm of the intensity of the peak (16) with account taken of (15). The quantities $R_1'$, $R_2'$ in (18) are the derivatives of the transition activation energies (cf. (12)) with respect to the controlling parameter $\beta$, evaluated at the phase-transition point $\beta_c$; they were determined quite independently from measurements of the transition probabilities. The data clearly demonstrate that the experimental results are self-consistent and also provide some insight into the origin of the supernarrow peak.

Fig. 8. Spectral densities $Q(\omega)$ of the fluctuations of the oscillator (1), (7) with $A = 0$ driven by a strong, nearly resonant, periodic force $F\cos(\omega_F t)$, plotted as a function of $\delta\omega = (\omega_F - \omega_0)/\Gamma$ for three values of the dimensionless field intensity: (a) $\beta = 0.048$; (b) 0.078; (c) 0.150. The histograms are measurements from the electronic model, and the full curves are theoretical predictions [46]. The supernarrow spectral peak appears at $\delta\omega = 0$ in (b).

Fig. 9. Variation of the intensity $I$ of the supernarrow peak with distance from the kinetic phase transition [46]. The square data points represent direct measurements. The crosses are theoretical values calculated from measured transition rates, and the full lines represent (18).

A related problem of considerable interest is that of the influence of the characteristics of the noise on the supernarrow peaks. The only such characteristics entering the expression for the peak shape (16) are the transition probabilities that, from (12), seem to depend on the noise only in terms of its intensity (for Gaussian noise). However, the values of the activation energies are highly sensitive to the shape of the power spectrum of the noise [37-41] and, by varying this shape, one can not only produce marked changes in $R_i$, but also shift the position of the phase-transition point.

M.I. Dykman, P.V.E. McClintock / Spectra of noise-driven nonlinear systems

We note in conclusion of this section that the supernarrow peaks in the SDFs and the susceptibilities of bistable systems, in particular systems displaying bistability in a strong periodic field, are not only of interest as a means of studying kinetic critical phenomena, e.g. in revealing the phase transition itself; they also provide a basis for the tunable filtering and detection of weak periodic signals.

4.  Stochastic  resonance  in  bistable  systems: 
linear  and  nonlinear  effects 

An important phenomenon inherent to fluctuating bistable systems, one that occurs in the range of the kinetic phase transition, is stochastic resonance (SR). In fact, there are two distinct groups of phenomena both called SR. Originally [5], the term was used of periodically driven bistable systems to describe the dome-like (bell-shaped), seemingly resonant, dependence on noise intensity of the depth of the periodic modulation [4] of the instantaneous populations $w_1(t)$, $w_2(t)$ of the stable states. The other, more general, perception of SR [7] (which includes the first type of SR as a subset) is simply as the increase and subsequent decrease with increasing noise intensity of the response to a periodic field, i.e. of the susceptibility of the system. Viewed in the latter way, SR is no longer restricted to bistable systems, but can arise in monostable ones as well, as has been demonstrated very recently [47].

In what follows, however, we concentrate on SR in bistable systems and we consider the phenomena associated with the modulation of the instantaneous populations of the stable states. It is clear that this modulation will give rise, in turn, to a strong modulation of the coordinates, momenta, and other dynamic characteristics, i.e. it represents a strong overall response of the system to the field. Of course, the effect will only come into play when the noise intensity is large enough for transitions to occur between the stable states: thus, the effect can be promoted by noise and, consequently, in a certain interval of noise intensity, the coherent periodic response of the system increases with increasing noise. It is also evident that, being associated with the redistribution over the wells, SR is closely related to the onset of the supernarrow peaks considered in the preceding section.

There are several physical observables displaying an SR-type dependence on noise (cf. refs. [4-7, 11, 12, 48]). We shall analyse first a (slightly modified compared to (4)) SDF of a bistable system driven by a trial field. It follows from the general concepts of statistical physics [1] that the average value of the coordinate of a system driven by a periodic force $A\cos(\Omega t)$ oscillates with the period $2\pi/\Omega$:

$$\langle q(t) \rangle = \sum_{n=0}^{\infty} a(n) \cos(n\Omega t + \phi(n)) \qquad (19)$$

(if the system is driven by two fields there are terms in (19) with both frequencies, and also with their combinations: see below). It is clear from (19) that if we define the SDF of the coordinate as

$$S(\omega) = \lim_{\tau \to \infty} (4\pi\tau)^{-1} \left| \int_{-\tau}^{\tau} dt\, q(t) \exp(i\omega t) \right|^2 \qquad (4a)$$

it will contain $\delta$-shaped peaks at the frequency $\Omega$ and its overtones. The intensity $S_n$ (total area) of the peak at the frequency $n\Omega$ is

$$S_n = \tfrac{1}{4} a^2(n). \qquad (20)$$

It was suggested in [7] that SR could conveniently be characterized by the ratio $\rho$ of the trial-field-induced spike in $S(\omega)$ at the frequency $\Omega$ to the value $Q(\Omega) = S(\Omega)$ of the SDF in the absence of trial field,

$$\rho = S_1/Q(\Omega) \qquad (21)$$

(the so-called signal-to-noise ratio). It is quite straightforward to determine this ratio experimentally and it provides an important measure of the system's response to a trial field.
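As an illustration of how the ratio in (21) might be evaluated from data, the sketch below extracts the harmonic amplitudes of an averaged coordinate sampled uniformly over exactly one driving period and forms the peak area divided by the background SDF. It assumes the peak area is one quarter of the squared amplitude, as read from (20); the helper `example_signal`, its amplitudes and phases, and the background value are invented purely for the demonstration.

```python
import cmath
import math


def example_signal(n_samples):
    """Synthetic averaged coordinate over one period (hypothetical amplitudes and phases)."""
    return [
        0.3 * math.cos(2.0 * math.pi * j / n_samples + 0.4)
        + 0.1 * math.cos(4.0 * math.pi * j / n_samples - 1.0)
        for j in range(n_samples)
    ]


def harmonic_amplitude(samples, n):
    """Amplitude a(n), n >= 1, of the n-th harmonic by discrete Fourier projection."""
    count = len(samples)
    coeff = sum(
        q * cmath.exp(-2j * math.pi * n * j / count) for j, q in enumerate(samples)
    ) / count
    return 2.0 * abs(coeff)


def signal_to_noise(samples, background_sdf_at_drive):
    """Area of the first-harmonic peak, a(1)^2 / 4, divided by the background SDF."""
    a1 = harmonic_amplitude(samples, 1)
    return 0.25 * a1 ** 2 / background_sdf_at_drive


if __name__ == "__main__":
    samples = example_signal(64)
    print("a(1) =", harmonic_amplitude(samples, 1))
    print("a(2) =", harmonic_amplitude(samples, 2))
    print("ratio =", signal_to_noise(samples, background_sdf_at_drive=0.01))
```

In a measurement the background value would come from the spectrum recorded without the trial field, at the driving frequency.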

4.1.  Linear  response  approximation 

The easiest way to gain insight into SR and to find $\rho$ is based on the fact that, for sufficiently weak trial fields (see below), the amplitudes of the harmonics $a(n)$ in (19) decrease very rapidly with increasing $n$ so that, to a good approximation, it suffices to allow for the forced oscillations at the frequency $\Omega$ only, i.e., to retain in (19) only the terms with $n = 0, 1$. The term with $n = 0$ describes the time-independent part of $\langle q(t) \rangle$, and it remains unchanged to first order in the field amplitude; the main effect of the weak field is the onset of the term with $n = 1$. Taking account only of these two terms constitutes the linear response approximation [1]. The linear response is fully characterised by a susceptibility $\chi(\Omega)$ [1, 49]:

$$a(1) = A|\chi(\Omega)|, \quad \rho = \tfrac{1}{4} A^2 |\chi(\Omega)|^2 / Q(\Omega),$$

$$\phi(1) \equiv \phi = -\arctan[\mathrm{Im}\, \chi(\Omega)/\mathrm{Re}\, \chi(\Omega)]. \qquad (22)$$

The susceptibility $\chi(\Omega)$ can be calculated analytically for some simple model systems [4, 8, 10, 50]. It should be noted, however, that there is a broad class of systems of interest where $\chi(\Omega)$ can be obtained from experimental measurements of the SDF in the absence of periodic driving, while a simple-minded analytical theory works only for a narrow range of parameters. This is the class of systems which are in thermal equilibrium (or quasiequilibrium). If a perturbing field is potential, i.e., its effect on a system can be determined by an extra term $-Aq\cos(\Omega t)$ in the Hamiltonian of the system, $\chi(\omega)$ can be expressed in terms of $Q(\omega)$ via the fluctuation-dissipation relations:

$$\mathrm{Re}\, \chi(\omega) = \frac{2}{T}\, \mathrm{P}\!\int_0^\infty d\omega_1\, Q(\omega_1)\, \omega_1^2/(\omega_1^2 - \omega^2), \quad \mathrm{Im}\, \chi(\omega) = \frac{\pi\omega}{T}\, Q(\omega), \qquad (23)$$

where P implies that we should take the Cauchy principal part of the integral. Some experimental data demonstrating, on one hand, the onset of SR in the signal-to-noise ratio $\rho$ and, on the other hand, the applicability of the relations (23) are shown in fig. 10. They refer to a Brownian "particle" (1) fluctuating in a symmetric double-well potential

$$U(q) = -\tfrac{1}{2} q^2 + \tfrac{1}{4} q^4. \qquad (24)$$

The two sets of data were obtained from an analogue electronic circuit [51] simulating (1), (24) in two different ways: first (squares) by measuring $\rho$ directly for the periodically driven system; and secondly (pluses) by making use of the measured $Q(\omega)$ obtained in the absence of periodic driving and of eqs. (22), (23). It is immediately evident from fig. 10 that $\rho$ has a distinct maximum as a function of the noise intensity $T$, thus demonstrating stochastic resonance, and also that the two ways of obtaining $\rho$ give identical results.

Fig. 10. Stochastic resonance [10]: the signal/noise ratio $\rho$ defined by (21) (plotted with a constant scale factor), measured for an electronic model of the oscillator (1), (24) driven by a weak periodic field, is plotted as a function of reduced noise intensity $T/\Delta U$. The square data points are direct measurements; the crosses are derived from (22), (23), based on measurements of the SDF in the absence of the periodic force. There are no adjustable parameters.
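The route from a measured spectrum to the susceptibility via (23) can be sketched numerically. The example below is a hedged illustration, not the procedure used for fig. 10: it assumes the relations have the form $\mathrm{Im}\,\chi = (\pi\omega/T) Q$ and $\mathrm{Re}\,\chi = (2/T)\,\mathrm{P}\!\int_0^\infty d\omega_1\, Q\, \omega_1^2/(\omega_1^2 - \omega^2)$, and it tests them on a single overdamped parabolic well, for which $\chi(\omega) = 1/(U'' - 2i\Gamma\omega)$ is known in closed form. The cutoff `omega_max`, the grid size and the parameter values are arbitrary choices.

```python
import math


def sdf_overdamped(omega, T, gamma, u2):
    """SDF of an overdamped particle in a parabolic well of curvature u2."""
    return (2.0 * gamma * T / math.pi) / (u2 ** 2 + 4.0 * gamma ** 2 * omega ** 2)


def im_chi(omega, T, Q):
    """Imaginary part of the susceptibility from the SDF Q."""
    return math.pi * omega * Q(omega) / T


def re_chi(omega, T, Q, omega_max=1000.0, steps=100000):
    """Real part from the principal-value integral, truncated at omega_max.

    The pole is removed by subtracting f(omega) from the numerator; the
    subtracted piece is integrated analytically over [0, omega_max].
    """
    def f(x):
        return Q(x) * x * x

    f0 = f(omega)
    dx = omega_max / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx  # midpoint rule
        denom = x * x - omega * omega
        if abs(denom) < 1e-12:
            continue  # skip a grid point that lands exactly on the pole
        total += (f(x) - f0) / denom
    total *= dx
    if omega > 0.0:
        total += f0 * math.log((omega_max - omega) / (omega_max + omega)) / (2.0 * omega)
    return 2.0 * total / T


if __name__ == "__main__":
    T, gamma, u2, omega = 0.2, 0.5, 1.0, 0.5
    Q = lambda w: sdf_overdamped(w, T, gamma, u2)
    exact = 1.0 / complex(u2, -2.0 * gamma * omega)
    print("Re chi:", re_chi(omega, T, Q), " exact:", exact.real)
    print("Im chi:", im_chi(omega, T, Q), " exact:", exact.imag)
```

For measured spectra the truncation of the integral at a finite frequency is the main source of error, since the integrand decays only as the inverse square of the frequency.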

Explicit expressions for $Q(\omega)$ and $\chi(\omega)$ for a Brownian particle (1) fluctuating in a bistable potential $U(q)$ can be obtained in the range of relatively small noise intensities,

$$T \ll \Delta U_1, \Delta U_2, \quad \Delta U_i = U(q_s) - U(q_i) \quad (i = 1, 2), \qquad (25)$$

where $q_{1,2}$ are the positions of the minima of the potential $U(q)$ and $q_s$ is that of the local maximum, so that $U'(q_{1,2}) = U'(q_s) = 0$, $q_1 < q_s < q_2$. In this range, $Q(\omega)$ and $\chi(\omega)$ are given [42] by the sums of the "partial" contributions from fluctuations about the equilibrium positions $q_{1,2}$ and those from interwell transitions:

$$Q(\omega) = \sum_{i=1,2} w_i Q_i(\omega) + Q_{\mathrm{tr}}^{(0)}(\omega), \quad \chi(\omega) = \sum_{i=1,2} w_i \chi_i(\omega) + \chi_{\mathrm{tr}}(\omega). \qquad (26)$$

In eq. (26) $w_i$ are the stationary populations of the stable states 1, 2 (cf. eqs. (14), (15)). The partial spectra $Q_i(\omega)$ in the low-noise range (25) for underdamped systems at $\omega$ close to $(U''(q_i))^{1/2}$ are given by eq. (6), while in the range of interest for SR, $\omega \ll (U''(q_i))^{1/2}$,

$$Q_i(\omega) = 2\Gamma T/\pi U_i''^2 + Q_{i0}(\omega), \quad U_i'' \equiv U''(q_i). \qquad (27)$$

Here, $Q_{i0}(\omega)$ is the zero-frequency peak due to the local asymmetry of the potential about the bottom of the $i$th well; it is described by the expression (11) for $Q_0(\omega)$, with $q_0'$, $q_0''$ calculated for the corresponding well. Alternatively, for overdamped systems,

$$Q_i(\omega) = (2\Gamma T/\pi)(U_i''^2 + 4\Gamma^2\omega^2)^{-1}, \quad \Gamma \gg (U_i'')^{1/2} \qquad (27a)$$

(a more detailed expression that allows for the corrections $\sim T/\Delta U_i$ is given in [50]). The expression for the interwell-transition-induced contribution $Q_{\mathrm{tr}}^{(0)}(\omega)$ in (26) is given by eq. (16); only the term with $n = 0$ in (16) contributes to (26) in the particular case under consideration. The values of the "partial" susceptibilities $\chi_i(\omega)$ and of the interwell-transition-induced term $\chi_{\mathrm{tr}}(\omega)$ in the susceptibility are expressed in terms of $Q_i(\omega)$, $Q_{\mathrm{tr}}^{(0)}(\omega)$ by the relations (23).

The expressions (16), (26), (27), (27a) explain (cf. also [52]) the dependence of $\rho$ on $T$ plotted in fig. 10: for very small noise intensities the inequality $W \ll \Omega$ holds, and the interwell transitions contribute neither to the SDF nor to the susceptibility so that, according to (23), (27), (27a), $\rho$ decreases roughly as $T^{-1}$ with increasing $T$. This is because the partial spectra $Q_i(\omega)$ are proportional to $T$, whereas the susceptibilities $\chi_i(\omega)$ are seen from (23) to be $T$-independent. The increase of $\rho$ starts for those $T$ where $W$ becomes of order of $\Omega$. In the range where the interwell-transition-induced terms are dominant both in the SDF and susceptibility, one arrives at the simple expression

$$\rho = \tfrac{1}{4} \pi A^2 w_1 w_2 W (q_1 - q_2)^2 / T^2, \quad W = W_{12} + W_{21}. \qquad (28)$$

It is seen from (12), (18), (28) that the dependence of the signal-to-noise ratio on the noise intensity is of the activation type, $\rho \propto \exp(-\Delta U_{\max}/T)$, where $\Delta U_{\max}$ is the depth of the deeper well.

Because the onset of SR is related to the supernarrow interwell-transition-induced peak, a strong amplification of the response to a weak trial periodic field would be expected to occur (for the present case of motion in a static potential) at comparatively small frequencies where the supernarrow peak at zero frequency can dominate the SDF:

(29)

The dependence of $\rho$ on $T$ for the range of the parameters outside the restrictions in (28) is still greatly simplified (compared to that given by (23), (27a)) for the particular situation of overdamped motion in a symmetric double-well potential,

$$U(q) = U(-q), \quad 2\Gamma \gg (U'')^{1/2} \quad (U'' \equiv U''(q_{1,2})). \qquad (30)$$

In this case, for sufficiently small frequencies,

$$\rho = (\pi A^2/4T)(f^2 U''^2 + \Omega^2)/(f U''^2 + 2\Gamma\Omega^2), \quad f = (q_2 - q_1)^2 W/4T,$$

$$\Omega,\ W,\ T/[2\Gamma(q_2 - q_1)^2] \ll U''/2\Gamma. \qquad (31)$$

For the same model, and in the same range of parameters, the phase shift between the signal $\langle q(t) \rangle$ and the driving force is

$$\phi = -\arctan\{(\Omega/U'')(f U''^2 + 2\Gamma\Omega^2)/(f W U'' + \Omega^2)\}. \qquad (32)$$

According to (31) the signal-to-noise ratio is minimal for the value of $T$ given by the expression $fU'' \sim \Omega$, and it increases rapidly for higher $T$ (cf. (28)). The maximum of $\rho$ vs. $T$ is reached in the region $T \sim \Delta U$, which is not described by the above analytic expressions for $Q(\omega)$, $\chi(\omega)$, but is still described by the fluctuation-dissipation relations.
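The behaviour described by (31) and (32) can be explored with a short numerical sketch. It is illustrative only and rests on several assumptions: the expressions are taken in the form $\rho = (\pi A^2/4T)(f^2U''^2 + \Omega^2)/(fU''^2 + 2\Gamma\Omega^2)$ and $\tan(-\phi) = (\Omega/U'')(fU''^2 + 2\Gamma\Omega^2)/(fWU'' + \Omega^2)$ with $f = (q_2 - q_1)^2 W/4T$; the potential is the quartic double well (24), for which $q_{1,2} = \mp 1$, $U'' = 2$ and $\Delta U = 1/4$; and the transition rate uses the standard overdamped Kramers prefactor for friction coefficient $2\Gamma$, which is an assumption made here rather than a value quoted in the text.

```python
import math


def total_rate(T, gamma, u2_well=2.0, u2_barrier=1.0, barrier=0.25):
    """W = W12 + W21 for a symmetric overdamped double well.

    Assumed Kramers form: sqrt(U''_well * |U''_barrier|) / (2 pi * 2 gamma) * exp(-dU / T).
    """
    one_way = math.sqrt(u2_well * u2_barrier) / (4.0 * math.pi * gamma)
    return 2.0 * one_way * math.exp(-barrier / T)


def snr(T, A, Omega, gamma, dq=2.0, u2=2.0):
    """Signal-to-noise ratio in the form assumed for eq. (31)."""
    f = dq ** 2 * total_rate(T, gamma) / (4.0 * T)
    return (math.pi * A ** 2 / (4.0 * T)) * (f ** 2 * u2 ** 2 + Omega ** 2) / (
        f * u2 ** 2 + 2.0 * gamma * Omega ** 2
    )


def phase_shift(T, Omega, gamma, dq=2.0, u2=2.0):
    """Phase shift in the form assumed for eq. (32), in radians."""
    W = total_rate(T, gamma)
    f = dq ** 2 * W / (4.0 * T)
    ratio = (Omega / u2) * (f * u2 ** 2 + 2.0 * gamma * Omega ** 2) / (
        f * W * u2 + Omega ** 2
    )
    return -math.atan(ratio)


if __name__ == "__main__":
    A, Omega, gamma = 1.0, 0.1, 0.5  # hypothetical driving and damping
    for T in (0.02, 0.03, 0.05, 0.07, 0.1):
        print(f"T = {T:4.2f}   rho = {snr(T, A, Omega, gamma):8.3f}"
              f"   -phi = {-phase_shift(T, Omega, gamma):6.3f} rad")
```

With these illustrative parameters the ratio first falls roughly as $1/T$, passes through a minimum, and then rises again once interwell transitions come into play, while the phase lag grows from its small intrawell value.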

It is evident from (32) that the phase shift also displays an SR-type behaviour [50]. From physical intuition, we may expect $\phi$ to provide a measure of the extent to which the external field is absorbed by the system. For very small noise, where the interwell transitions do not come into play, it follows from (32) that $|\phi| \approx 2\Gamma\Omega/U''$ is also very small: intrawell absorption of a low-frequency field is weak (the absorption band is broad, with the width $U''/2\Gamma \gg \Omega$). The increase of $|\phi|$ with $T$ starts, however, for quite small $T$ where $\rho$ is still decreasing; $|\phi|$ reaches its maximum value when $T = T_{\max}$ is still small compared with $\Delta U$:

$$(-\phi)_{\max} = \arctan\{\tfrac{1}{2}[(q_2 - q_1)^2 U''/4T_{\max}]^{1/2}\}, \quad W(T_{\max}) = \Omega[4T_{\max}/(q_2 - q_1)^2 U'']^{1/2}. \qquad (33)$$

It can be seen from (33) that $|\phi|_{\max}$ is quite large, i.e., there then is a strong absorption of the periodic field. This absorption is due primarily to the interwell transitions. We note that the absorption coefficient itself, which is proportional to $\mathrm{Im}\, \chi(\Omega)$, also displays an SR-type behaviour. Both $|\phi|$ and $\mathrm{Im}\, \chi(\Omega)$ are much steeper on the small-$T$ side of their maxima, because it is the activation dependence of the transition probabilities on $T$ that determines the behaviour of $\chi(\Omega)$ in this range.

The stochastic-resonance-like dependence of the phase shift upon noise intensity has been clearly demonstrated in analogue electronic experiments [50]. Some data for an overdamped oscillator with the potential (24) are shown in fig. 11. The simple expression (31) is evidently in excellent agreement with the experimental data (the prefactor in the expression (12) for the transition probability of an overdamped system was taken to be of the standard form [29]); the comparison of theory with experiment does not involve any adjustable parameters.

Fig. 11. The phase shift $-\phi$ (degrees) between the weak periodic force $A\cos\Omega t$ and the averaged coordinate $\langle q(t) \rangle$, measured for an electronic model of the overdamped oscillator described by (1), (24), (3) with $\Omega = 0.1$ for $A/2\Gamma = 0.04$ (circle data points) and $A/2\Gamma = 0.2$ (squares). The dashed curve represents the simple linear response prediction (31); the full curve takes account of nonlinear corrections for $A/2\Gamma = 0.04$ [50].

4.2.  Nonlinear  effects 

One of the more intriguing features of the response of a bistable system to a low-frequency field (the effects for high-frequency fields will be considered below) is the possibility of observing strongly nonlinear effects, even for small field amplitudes $A$. Such a possibility arises because the field-induced modulation of the populations of the attractors comes about primarily through a modulation of the activation energies of fluctuational transitions; and the effect of the latter modulation is enhanced exponentially, because it is with the small noise intensity that this modulation should be compared (cf. (12)). It is seen from the expressions (15) that the parameter $g$ describing the redistribution over the attractors is of the form

$$g = \frac{A}{D} \left\{ \frac{d}{dA} [R_1(A) - R_2(A)] \right\}_{A=0}. \qquad (34)$$

Here, $R_i(A)$ is the activation energy of the transition from the state $i$ for the initial system driven additionally by a weak force $A$, and the derivatives are calculated for $A = 0$ (see [10a]; the importance of a parameter of this kind was also recognised recently in [11]). The variation of $R_1$, $R_2$ under the weak force $A$ is assumed small: accordingly, only terms of the first order in $A$ will be taken into account in $R_i(A)$.

In considering nonlinear effects we shall assume the field $A\cos(\Omega t)$ to be slowly varying, $\Omega \ll U''/2\Gamma$, $\Gamma$ (cf. (31)), so that the transition probabilities can be considered in the adiabatic approximation. In this case their values depend on the instantaneous value of the field as they would if it were a fixed parameter. Then according to (12) the instantaneous transition probability $W_{ij}(t)$ is given by the expression

$$W_{ij}(t) = W_{ij}^{(0)} \sum_{k=-\infty}^{\infty} I_k(g_i) \exp(ik\Omega t), \quad g_i = -\frac{A}{D} \left[ \frac{dR_i}{dA} \right]_{A=0}, \qquad (35)$$

where $W_{ij}^{(0)}$ are the values of the transition probabilities in the absence of the field, i.e., for $A = 0$, and $I_k$ are modified Bessel functions [53]. The periodic dependence of the transition probabilities on time, which is strongly nonsinusoidal for $|g_i| > 1$, gives rise to the nonsinusoidal time dependence of the instantaneous state populations $w_{1,2}(t)$. Eqs. (14), (35) result in the following set of linear algebraic equations for the Fourier components $w_{1k}$:

$$w_1(t) = \sum_{k=-\infty}^{\infty} w_{1k} \exp(ik\Omega t),$$

$$ik\Omega\, w_{1k} + \sum_n [W_{12}^{(0)} I_n(g_1) + W_{21}^{(0)} I_n(g_2)]\, w_{1\,k-n} = W_{21}^{(0)} I_k(g_2). \qquad (36)$$

It is straightforward to express the amplitudes $a(k)$ and phases $\phi(k)$ of the forced vibrations of the system (cf. eq. (19)) in terms of $w_1^{(k)}$. For $k > 1$,

$$ a(k) = 2\left|(q_1 - q_2)\,w_1^{(k)}\right|, \qquad \phi(k) = \arg\left[(q_1 - q_2)\,w_1^{(k)}\right] \quad (k > 1), \tag{37} $$

while  the  expressions  for  a(l),  </>(!)  are  of  the 
form  (22)  with  the  susceptibility  x(^)  having 
been  replaced  by  x(f2): 


X(n)  =  vv,„Ar,(/2)  +  (1  -  W|„);f2(w) 

+  2A  (38) 

Eqs. (35)-(38) make it straightforward to compute the response to a slowly varying field for arbitrary nonlinearity. They obviously go over into the results of linear-response theory in the limit of weak field where $|g_{1,2}| \ll 1$.
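
The origin of the nonsinusoidal response can be illustrated numerically. The following is a minimal sketch, assuming only that the adiabatic transition probabilities depend on the field through a factor exp(g cos Ωt), whose Fourier coefficients are the modified Bessel functions I_k(g). The function names (`fourier_coefficient`, `bessel_i`, `harmonic_ratio`) are illustrative and do not come from the paper.

```python
import math


def fourier_coefficient(g, k, n_grid=2048):
    """k-th Fourier coefficient of exp(g*cos(theta)) over one period.

    A rectangle rule is used; for a smooth periodic integrand it converges
    very quickly. Analytically the result is the modified Bessel function I_k(g).
    """
    total = 0.0
    for n in range(n_grid):
        theta = 2.0 * math.pi * n / n_grid
        total += math.exp(g * math.cos(theta)) * math.cos(k * theta)
    return total / n_grid


def bessel_i(k, g, terms=60):
    """Modified Bessel function I_k(g) from its power series (integer k >= 0)."""
    return sum(
        (g / 2.0) ** (2 * m + k) / (math.factorial(m) * math.factorial(m + k))
        for m in range(terms)
    )


def harmonic_ratio(g, k=2):
    """Weight of the k-th harmonic relative to the fundamental, I_k(g)/I_1(g)."""
    return bessel_i(k, g) / bessel_i(1, g)


if __name__ == "__main__":
    # Higher harmonics are negligible for |g| << 1 and comparable to the
    # fundamental once |g| exceeds unity.
    for g in (0.1, 1.0, 3.0):
        print(g, harmonic_ratio(g, 2), harmonic_ratio(g, 3))
```

For small g the ratio I_2(g)/I_1(g) behaves as g/4, so the modulation of the transition probabilities is nearly sinusoidal; for g of order unity or larger the higher harmonics carry comparable weight.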

The extreme nonlinear case $|g_{1,2}| \gg 1$ can also be analysed analytically, by application of a quite different approach [10a]. In this case interwell transitions from the state $i$, for example, happen with an overwhelming probability during that part of the period of the driving field $2\pi/\Omega$ where the activation energy $R_i(t)$ is minimal, i.e., the field works as a shutter (we stress that the field itself is weak; this is not a deterministic, but a probabilistic shutter). As a result, the average signal at the output will be rectangular. In particular, in the case of Brownian motion in a symmetric double-well potential (30), in the neglect of intrawell contributions,


9  =  -<7,  tanh  g  , 


<39) 


where $\theta$ is the unit step-function. We note that the "amplitude" $q$ of the rectangular wave (39) saturates quite quickly as a function of $g$ (starting with $g \approx 1.5$), and therefore the intensities of the spectral peaks in the SDF $S(\omega)$ as defined by (4a) depend only weakly on the field amplitude $A$.


Fig.  12.  The  averaged  coordinate  {q{t))  measured  for  an 
electronic  model  [50]  of  the  overdamped  oscillator  (1),  (24), 
(30),  driven  by  a  periodic  force  y4  cos  fir  with  AI2r  =  {).\, 
Tlir  =  0.0644,  for  a  very  low  frequency  17  =  1.9  x  10  As 
predicted  theoretically  (39),  the  result  approximates  a  square 
wave.  Its  tops  and  bottoms  are  curved  due  to  intra-well 
vibrations,  and  tilted  due  to  the  phase  shift  between  the  latter 
and  the  inter-well  transitions. 


The nearly rectangular signal under sinusoidal driving by a slowly varying field has been observed in an analogue electronic experiment [50]. The result is shown in fig. 12. The distortion of the signal is related to the contribution of the forced intrawell vibrations. We stress that the periodic driving force was itself comparatively weak, so that the nonlinearity of this effect is indeed quite remarkably strong.

4.3. "Nonconventional" stochastic resonance

Until  recently,  stochastic  resonance  was 
considered  purely  as  an  effect  that  arises  for 
Brownian  motion  in  a  static  bistable  potential 
with  a  superimposed  slowly  varying  field  (cf. 
[5-12,  50,  52]).  It  follows  from  the  above  formu¬ 
lation,  however,  that  it  is  actually  a  quite  general 
phenomenon  for  fluctuating  bistable  systems  in 
the  range  of  a  kinetic  phase  transition.  Conse¬ 
quently,  it  may  also  be  expected  to  occur  for 
systems  displaying  bistability  under  strong 
periodic  driving  [46]  (the  onset  of  a  large  suscep¬ 
tibility  with  respect  to  an  additional  weak  trial 
field  was  predicted  for  systems  of  just  this  kind  in 
[4]).  In  this  latter  case  SR  will  be  inherent  to  the 
response,  not  only  to  a  low-frequency,  but  also 
to  a  high-frequency  field  [54].  Also,  since  bist¬ 
able  systems  are  strongly  nonlinear,  periodic 
driving  of  various  parameters  (not  only  of  their 
coordinates  or  momenta)  can  also  give  rise  to  a 
periodic  signal  (i.e.  to  periodic  variation  of  the 
coordinate),  and  in  some  cases  this  signal  can 
display  a  dome-like  dependence  on  the  noise 
intensity.  One  such  parameter  could  be  the  noise 
intensity  itself  [55].  The  results  for  these  two  new 
types  of  SR  are  described  briefly  below. 

First, we consider high frequency stochastic resonance for periodic attractors. As pointed out above, the onset of SR is related to a comparatively strong noise-enhanced modulation of the populations of the attractors $w_1(t)$, $w_2(t)$ by a trial field. Since the characteristic time-scale for the variation of the populations is given by the reciprocal transition probabilities, the modulation can be effective provided it is slow. A feature of nonlinear systems is that they perform mixing of the frequencies of external fields. Therefore, if a system is driven by a (strong) field $F\cos(\omega_F t + \phi_F)$ and a (weak) trial field $A\cos(\Omega t)$ then the variables of the system will oscillate at combination frequencies $|\pm n\omega_F + \Omega|$ ($n = 0, 1, 2, \ldots$) and thus if one of these is small the corresponding oscillations can give rise to effective redistribution over the attractors.

The simplest case is just $|\omega_F - \Omega| \ll t_r^{-1}$. The dynamics of the system in this case can be considered as that in the strong field $\mathrm{Re}[F(t)\exp(i\omega_F t + i\phi_F)]$, but with the complex amplitude $F(t)$ slowly varying in time,


$$ F(t) = F + A\exp[i(\Omega - \omega_F)t - i\phi_F]. \tag{40} $$


The activation energies $R_{1,2}$ of the transitions between the attractors depend on $F$ (strictly, on $F^2$, since they are determined by the intensity rather than by the fast oscillating phase of the field). For small $|\Omega - \omega_F|$, they get modulated at frequency $|\Omega - \omega_F|$ and, for a sufficiently weak trial field, $R_j$ in the expression (12) should be replaced by $R_j(t)$,


$$ R_j(t) = R_j + \frac{dR_j}{dF^2}\,2AF\cos[(\Omega - \omega_F)t - \phi_F]. \tag{41} $$
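
The two algebraic steps behind the slowly varying amplitude can be checked numerically. The sketch below is illustrative only: it assumes a strong field F cos(ω_F t + φ_F) plus a weak trial field A cos(Ωt), and verifies (i) that their sum equals the real part of a slowly varying complex amplitude multiplying exp(iω_F t + iφ_F), and (ii) that the squared modulus of that amplitude is F² + 2AF cos[(Ω − ω_F)t − φ_F] up to a term A². All function and argument names are hypothetical.

```python
import cmath
import math


def total_field(t, F, A, omega_F, Omega, phi_F):
    """Strong field plus weak trial field, written directly."""
    return F * math.cos(omega_F * t + phi_F) + A * math.cos(Omega * t)


def slow_amplitude(t, F, A, omega_F, Omega, phi_F):
    """Complex amplitude F(t) = F + A*exp[i(Omega - omega_F)t - i*phi_F]."""
    return F + A * cmath.exp(1j * ((Omega - omega_F) * t - phi_F))


def field_from_slow_amplitude(t, F, A, omega_F, Omega, phi_F):
    """Re[F(t) * exp(i*omega_F*t + i*phi_F)]; should equal total_field."""
    carrier = cmath.exp(1j * (omega_F * t + phi_F))
    return (slow_amplitude(t, F, A, omega_F, Omega, phi_F) * carrier).real


def intensity_first_order(t, F, A, omega_F, Omega, phi_F):
    """|F(t)|**2 to first order in A."""
    return F ** 2 + 2.0 * A * F * math.cos((Omega - omega_F) * t - phi_F)


if __name__ == "__main__":
    params = dict(F=1.0, A=0.05, omega_F=2.0, Omega=2.1, phi_F=0.3)
    for t in (0.0, 1.0, 5.0):
        exact = abs(slow_amplitude(t, **params)) ** 2
        print(t, exact, intensity_first_order(t, **params))
```

The neglected term is exactly A², which is why a first-order expansion of the activation energies in the intensity is adequate for a sufficiently weak trial field.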


The further analysis of the redistribution over the attractors is closely similar to that in the preceding subsection. It should be stressed, however, that the modulation of the populations of the attractors at frequency $|\Omega - \omega_F|$ gives rise to periodic oscillations, not only at the trial-field frequency $\Omega$, but also at the mirror-reflected frequency $|2\omega_F - \Omega|$. For small $A$, where the linear-response approximation holds, the trial-field-induced addition to the average value of the coordinate is of the form


5  ( q{t))  =  A  Re{;ir(/2)  exp(-i/20  +  ;t(/2) 

X  exp[-i(2£«>p  -  /2)/]}  .  (42) 


In  the  case  of  weak  noise,  the  susceptibilities 
x(^)  can  be  written  in  the  form  (26),  and 
the  transition-induced  contributions  are  of  the 
form 

Ar.r(^)  =  ^  <7:(1)] 

^  diR,-R,)  W 

dF^  W-i(/2-Wp)’ 

^tr(^)  =  Ar.r(2wp  -  Q)  exp(-2i<^p) .  (43) 

Both  of  them  display  SR. 

High-frequency stochastic resonance (HFSR) of this type has been observed for periodic attractors in analogue electronic experiments [54]. The system simulated was the one already discussed above in section 3: an underdamped nonlinear oscillator with a single-well potential given by eq. (7) with A = 0, which has two types of coexisting vibrational states under a sufficiently strong nearly resonant field. When the oscillator was driven, in addition, by a trial field of frequency $\Omega \approx \omega_F$ there occurred two clearly resolved extra $\delta$-shaped spikes in the SDF of the coordinate $S(\omega)$. The dependence of the intensity of these spikes on the noise intensity can be seen from fig. 13 to be just of the SR-type. The theoretical curves are based on measured values of the activation energies of the transitions (cf. fig. 9); the experimental uncertainty arising from the latter data is shown by the bars. Given the large systematic errors inherent in these measurements (arising e.g. from $\beta$ (17), which contains the small difference between two large quantities $|\omega_F - \omega_0|$ raised to its third power), the agreement can be regarded as very satisfactory; in particular, the theoretical and experimental curves are of a similar shape, and their maxima lie at nearly the same $T$. Fig. 14 demonstrates that high-frequency SR is a purely critical phenomenon: the intensities of the spikes decrease exponentially as the control parameter $\beta$ (17) moves away from its critical value. We note that these experiments are quite delicate, since an extremely high resolution is necessary to observe and investigate the peaks, given that they must be separated by a frequency difference much smaller than the reciprocal relaxation time which, in its turn, is much smaller than the frequencies $\omega_F$, $\Omega$ themselves.

Fig. 13. High frequency stochastic resonance for periodic attractors, measured for an electronic model of the oscillator described by (1), (7) with A = 0, driven by a strong periodic field $F\cos(\omega_F t + \phi_F)$ and a weak trial force $A\cos\Omega t$ [54]. The logarithms of the intensities $S$ of the $\delta$-shaped spikes in the spectral density of the fluctuations (a) at frequency $\Omega$ and (b) at $2\omega_F - \Omega$ are plotted (data points) as a function of the noise intensity $T$. The curves are theoretical predictions based on measured values of the activation energies; they are subject to the systematic uncertainties indicated by the bars. There are no adjustable parameters.

Fig. 14. The logarithm of the intensity $S$ of the $\delta$-shaped spike at frequency $\Omega$ in the spectral density of the fluctuations for high-frequency stochastic resonance [54], plotted as a function of the dimensionless field intensity $\beta$ (17) which gives a measure of the distance from the kinetic phase transition line.

The second nonconventional form of SR refers to physical situations where the noise and signal acting upon a system are not additive, but multiplicative: it is noise (or noise intensity) that is modulated directly by the signal, and it may be such a periodically modulated noise that drives the system itself. If the initial noise is of zero mean, the driving field will also be of zero mean. Nonetheless, a nonlinear system can still detect the modulating signal via nonlinear transformations. It is demonstrated below that, for bistable systems, the quality of detection may increase with increasing noise intensity and display an SR-type behaviour.

We shall consider SR in the response to modulated noise for the simplest bistable system: an overdamped particle oscillating in a bistable potential and described by the equation

$$ 2\Gamma\dot{q} + U'(q) = \xi(t), \qquad \xi(t) = \left[\tfrac{1}{2}A\cos(\Omega t) + 1\right] f(t), \qquad \langle f(t) f(t')\rangle = 4\Gamma T\,\delta(t - t'). \tag{44} $$

In contrast to the former analysis it is the amplitude of the noise that is assumed to be periodically modulated here; the potential is assumed to be asymmetric, with the asymmetry parameter $\lambda$ (the asymmetry turns out to be crucial for obtaining a well-pronounced SR).
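
A model of this kind is straightforward to explore numerically. The following is a minimal sketch, not the authors' procedure: it integrates an overdamped particle driven by white noise whose amplitude is periodically modulated, using the Euler-Maruyama scheme. The quartic potential U(q) = -q²/2 + q⁴/4 + λq, the modulation factor 1 + depth·cos(Ωt), and all function and parameter names are assumptions made for illustration.

```python
import math
import random


def potential_force(q, lam):
    """-U'(q) for the illustrative potential U(q) = -q**2/2 + q**4/4 + lam*q."""
    return q - q ** 3 - lam


def simulate_modulated_noise(n_steps, dt, T, depth, Omega, lam,
                             Gamma=0.5, q0=0.0, seed=0):
    """Euler-Maruyama integration of
    2*Gamma*dq/dt = -U'(q) + [1 + depth*cos(Omega*t)] * f(t),
    with <f(t) f(t')> = 4*Gamma*T*delta(t - t')."""
    rng = random.Random(seed)
    q = q0
    path = [q]
    for n in range(n_steps):
        t = n * dt
        drift = potential_force(q, lam) / (2.0 * Gamma)
        # After dividing by 2*Gamma, the noise increment over one step has
        # variance 4*Gamma*T*dt / (2*Gamma)**2 = T*dt/Gamma.
        kick = math.sqrt(T * dt / Gamma) * rng.gauss(0.0, 1.0)
        q += drift * dt + (1.0 + depth * math.cos(Omega * t)) * kick
        path.append(q)
    return path


if __name__ == "__main__":
    path = simulate_modulated_noise(
        n_steps=200_000, dt=0.01, T=0.08, depth=0.07, Omega=0.029, lam=0.2
    )
    print(sum(path) / len(path))
```

Because the modulation factor depends on time only and not on q, no distinction between the Ito and Stratonovich interpretations arises in this sketch. Averaging many such trajectories at fixed phase of the modulation would give an estimate of the periodic part of the mean coordinate.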

If the amplitude $A$ is sufficiently weak, we can characterise the response of the system to the corresponding modulation in terms of a generalised susceptibility $X(\omega)$ and write the signal-induced term in the average value of the coordinate as

$$ \delta\langle q(t)\rangle = A\,\mathrm{Re}\left[X(\Omega)\exp(-i\Omega t)\right]. \tag{45} $$

For weak noise intensities and for a slowly varying field, $\Omega \ll U''/2\Gamma$, the function $X(\Omega)$ (just as for the "normal" susceptibility $\chi(\Omega)$) is a sum of contributions from the vibrations in the vicinities of the stable states $q_i$, and from the interwell transitions (cf. (26)):

$$ X(\Omega) = \sum_{i=1,2} w_i X_i(\Omega) + X_{\mathrm{tr}}(\Omega). \tag{46} $$

The contribution from the interwell transitions, which is the one of primary interest, originates from the fact that the transition probabilities depend on the instantaneous value of the noise intensity (it varies slowly because of the modulation) and, provided that the potential is asymmetric so that $\Delta U_1 \neq \Delta U_2$, the periodic variation of the noise intensity gives rise to a periodic change of the state populations; for a symmetric potential, the variation of the noise intensity does not break the symmetry, and so the populations remain equal. The resulting expression for $X_{\mathrm{tr}}(\Omega)$ is of the form [55]

--J. - fjjrriTj) - ■ 

(47) 

Thus, a periodic signal will indeed occur under driving by a zero-mean periodically modulated noise for an asymmetric potential ($\Delta U_1 \neq \Delta U_2$); furthermore, the amplitude of the signal $|X_{\mathrm{tr}}(\Omega)|$ is seen from (47) to increase sharply with the increasing noise intensity $T$.

The dependence of the signal-to-noise ratio, defined by analogy to (21) as the ratio of the $\delta$-shaped spike in the SDF $S(\omega)$ at frequency $\Omega$ to the value of $S(\Omega) = Q(\Omega)$ in the absence of modulation, is shown in fig. 15: the theoretical prediction is compared with the results from an analogue electronic experiment (the lower full curve and square data points, respectively). The phenomenon of stochastic resonance is clearly evident in this situation, although slightly less pronounced than for "conventional" periodic driving (upper curve and circle data), i.e. driving the system rather than the noise. It is evident from the lower curve and data of fig. 16 that this type of SR is intimately connected with the asymmetry of the potential (44), i.e. with the finiteness of the parameter $\lambda$; for $\lambda = 0$ the signal could not be detected. At the opposite extreme, for very strong asymmetry, there will again be no SR because in practice only one well will be populated for weak noise and the interwell transitions will be "frozen out". Therefore, in order to investigate SR under these conditions, it is necessary to optimise the asymmetry of the system.

Fig. 15. Stochastic resonance for periodically modulated noise [55]. Measurements (x15) of the signal/noise ratio p defined by (21) for an electronic circuit model of (44) with $A = 0.14$, $\lambda = 0.2$, $\Omega = 0.029$ are plotted (data points) as a function of the reduced noise intensity $T/\Delta U$ ($\Delta U = 1/4$); the full curve represents the theoretical prediction. The upper curve and circle data show the theory and measurements using the same circuit with additive periodic forcing (conventional stochastic resonance) under similar conditions (theoretical results are valid for $T \ll \Delta U_{1,2}$, strictly speaking).

Fig. 16. Effect of the asymmetry parameter $\lambda$ on stochastic resonance [55]. Measurements (x15) of the signal/noise ratio p defined by (21) for an electronic circuit model of (44) with $A = 0.15$, $(T/\Delta U)_{\lambda=0} = 0.303$, $\Omega = 0.029$ are plotted (square data points) as a function of $\lambda$; the full curve represents the theoretical prediction. The upper curve and circle data show the theory and measurements using the same circuit with additive periodic forcing (conventional stochastic resonance) under similar conditions.

5.  Conclusions 

It  follows  from  the  above  results  that  the 
traditional  field  of  noise-driven  dynamics  and,  in 
particular,  investigations  of  the  spectral  densities 
of  fluctuations  of  noise-driven  systems,  is  far 
from  being  exhausted.  There  still  arise  new  and 
unexpected  phenomena  like  the  noise-induced 
narrowing  of  spectral  peaks,  the  onset  of  extra 
peaks such as the zero-dispersion and supernarrow
ones, and also stochastic resonance. All of
these  phenomena  are  very  general.  They  are  not 
“pinned”  to  particular  models,  and  thus  they  are 
of  fundamental  interest.  At  the  same  time,  they 
are  also  rich  in  potential  applications,  ranging 
from solid state physics, through electrons localised
in Penning traps, to neurons and neural
networks,  as  mentioned  above.  There  still  re¬ 
main  a  number  of  significant  problems  that  have 
not  been  solved,  or  even  addressed.  Many  of 
these  are  related  to  the  interplay  of  dynamical 
chaos  and  noise.  We  hope,  therefore,  that  these 
investigations  will  continue,  and  that  our  under¬ 
standing  of  noise-driven  nonlinear  dynamics  will 
thereby  be  substantially  enriched. 
The success of current attempts to distinguish between low-dimensional chaos and random behavior in a time series of
observations  is  considered.  First  we  discuss  stationary  stochastic  processes  which  produce  finite  numerical  estimates  of  the 
correlation  dimension  and  K2  entropy  under  naive  application  of  correlation  integral  methods.  We  then  consider  several 
straightforward  tests  to  evaluate  whether  correlation  integral  methods  reflect  the  global  geometry  or  the  local  fractal 
structure  of  the  trajectory.  This  determines  whether  the  methods  are  applicable  to  a  given  series:  if  they  are  we  evaluate 
the  significance  of  a  particular  result,  for  example,  by  considering  the  results  of  the  analysis  of  stochastic  signals  with 
statistical  properties  similar  to  those  of  observed  series.  From  the  examples  considered,  it  is  clear  that  the  correlation 
integral  should  not  be  used  in  isolation,  but  as  one  of  a  collection  of  tools  to  distinguish  chaos  from  stochasticity. 


1.  Introduction 

In  the  past  ten  years,  a  variety  of  methods  to 
extract  phase  space  dynamical  information  from 
experimentally  observed  or  computer  generated 
time  series  have  been  developed,  see  e.g.  refs. 
[1-28].  These  methods  are  generally  based  on  a 
phase  space  reconstruction  (typically  a  “time 
embedding”  procedure,  see  refs.  [22,29])  and 
are  devoted  to  the  calculation  of  the  properties 
of  a  (supposed)  underlying  attractor  (such  as  the 
correlation  dimension  [12,  14,  20,  23,  25],  the  K2 
entropy  [13]  and  the  Lyapunov  exponents  [1,9, 
18,  28]),  to  the  determination  of  the  approximate 
number  of  the  (empirical)  modes  excited  in  the 
system  through  singular  value  decomposition 
(SVD)  [3],  to  the  issue  of  predicting  the  future 
evolution  of  the  system  from  the  knowledge  of 
its  past,  in  a  spirit  which  is  the  extension  of 
classical autoregressive (AR) approaches [5, 10, 11, 21, 30], or even toward reconstructing the
equations of motion of the system [7, 10].

The  “static”  methods  based  on  the  correlation 
integral  [12-14,  24,  25]  differ  from  prediction 
methods  in  that  the  former  do  not  explicitly  take 
into  account  information  from  the  ordering  of 
the  points  in  the  time  series.  The  methods  men¬ 
tioned  above  provide  information  on  systems 
which  are  known  to  be  dominated  by  low-dimen¬ 
sional  deterministic  dynamics  and  there  exists  a 
noticeable  difference  in  the  results  from  low¬ 
dimensional  chaotic  systems  and  uncorrelated 
(white)  noise.  Applications  to  well-controlled 
laboratory  experiments  have  led  to  determining 
the  presence  of  low-dimensional  chaos  in  several 
experimental  contexts,  see  e.g.  refs.  [31-33]; 
note  that  these  systems  were  characterized  by  a 
limited  degree  of  space  complexity  and  by  the 
ability  to  adjust  control  parameters. 

The situation turns out to be much more complicated for natural (uncontrolled) systems, see
e.g.  refs.  [34-46],  where  claims  and  counter¬ 
claims  for  low-dimensional  attractors  coexist  for 
the  same  data,  as  well  as  for  systems  dominated 
by  the  presence  of  “colored”  noises  with  power- 
law  power  spectra  [43, 47-49]  or  for  non-linear 
stochastic  processes  [27].  In  this  paper  we  briefly 
review  some  of  the  problems  encountered  in  the 
study  of  systems  characterized  by  the  presence  of 
correlated  stochastic  processes  and  we  discuss  a 
few  simple  tests  which  can  be  of  use  in  the 
attempt to distinguish between low-dimensional
(dissipative) determinism and stochastic
noise. 


2.  Behavior  of  correlated  noises 

The  majority  of  quantitative  attempts  to  detect 
low-dimensional  attractors  from  time  series  data 
have  focused  on  the  estimation  of  the  correlation 
dimension  and  of  the  correlation  entropy  K2. 
Given a scalar time series $x(t_i)$, the first step in the analysis is to employ an embedding procedure to reconstruct the system phase space. Here we consider the method of delays [22, 29], where a vector time series is defined as

$$ \mathbf{x}(t_i) = \{x(t_i),\ x(t_i + \tau),\ \ldots,\ x(t_i + (M-1)\tau)\}. \tag{2.1} $$

Here $\tau = m\,\Delta t$ is an appropriate time delay, $\Delta t$ is the effective sampling time and $M$ is the dimension of the vector $\mathbf{x}(t_i)$. Recent discussions of embedology are given in refs. [6, 19]. The crucial idea underlying the embedding procedure (2.1) is that the observed variable $x(t)$ contains information on all the other phase space variables of the system. In the case of weakly coupled phase space variables, however, the above method may lead to misleading results [40]. Note that the choice of the time delay $\tau$ is somewhat arbitrary [17, 50]; there may not even be a unique good selection criterion for this parameter [15].
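
The method of delays can be sketched in a few lines. The example below is illustrative: the function name `delay_embed` and its arguments are not from the paper, and the delay is given as an integer number of sampling steps m, so that the delay time is m times the sampling interval. The number of vectors returned is N − m(M − 1).

```python
def delay_embed(series, m_delay, dim):
    """Build delay vectors (x_i, x_{i+m}, ..., x_{i+(M-1)m}) from a scalar series.

    series  : sequence of N scalar samples
    m_delay : delay in units of the sampling step (the integer m)
    dim     : embedding dimension M
    """
    n_vectors = len(series) - m_delay * (dim - 1)
    if n_vectors <= 0:
        raise ValueError("series too short for this delay and dimension")
    return [
        tuple(series[i + k * m_delay] for k in range(dim))
        for i in range(n_vectors)
    ]


if __name__ == "__main__":
    samples = list(range(10))
    for vector in delay_embed(samples, m_delay=2, dim=3):
        print(vector)
```

With ten samples, m = 2 and M = 3 this yields six vectors, from (0, 2, 4) to (5, 7, 9).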


The correlation integral $C_M(r)$ of the reconstruction is defined as [12]

$$ C_M(r) = \frac{1}{N'^2}\sum_{i \neq j}^{N'} \theta\left(r - \|\mathbf{x}(t_i) - \mathbf{x}(t_j)\|\right), \tag{2.2} $$

where $\theta$ is the Heaviside step function, $N$ is the number of points in the time series, $N' = N - m(M-1)$ and the vertical bars indicate the norm of the vector. Efficient implementations of (2.2) are available [14, 24] and a clear overview of the analysis is presented in ref. [25]. One is then interested in the scaling properties of the correlation integral, in particular whether $C_M(r)$ is a power-law at small scales, that is

$$ C_M(r) \sim r^{\nu_M}, \tag{2.3} $$

where we have ignored any effects due to lacunarity [2, 51]. If (2.3) holds, the next step is to examine the behavior of the estimated correlation exponent $\nu_M$ with increasing $M$. For point distributions with a low-dimensional geometry, one may show that at sufficiently large $M$ [6, 8, 12]

$$ \nu_M = \nu, \tag{2.4} $$

where $\nu$ is the correlation dimension; for deterministic dynamical systems, this quantity provides an estimate of the number of degrees of freedom excited in the system. We stress that eqs. (2.3) and (2.4) assume a very large data set and that the results should be independent of $\tau$ over an appropriate range of values.
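
A direct, unoptimised evaluation of the correlation integral and of its log-log slope can be sketched as follows. This is an illustration rather than one of the efficient implementations cited above: it uses the Euclidean norm and normalises by the number of distinct pairs (which for a large number of vectors differs negligibly from a normalisation by the squared number of vectors); both choices, and the function names, are assumptions.

```python
import math


def correlation_integral(vectors, r):
    """Fraction of distinct pairs of vectors closer than r (Euclidean norm)."""
    n = len(vectors)
    close = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            if math.dist(vectors[i], vectors[j]) < r:
                close += 1
    return 2.0 * close / (n * (n - 1))


def correlation_exponent(vectors, radii):
    """Least-squares slope of log C(r) against log r over the given radii."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(correlation_integral(vectors, r)) for r in radii]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Points on a straight segment in the plane: a one-dimensional set,
    # so the fitted exponent should be close to 1.
    segment = [(i / 200, i / 200) for i in range(200)]
    print(correlation_exponent(segment, [0.05, 0.1]))
```

The radii passed to `correlation_exponent` play the role of the scaling range; they must be large enough that every radius captures at least one pair, otherwise the logarithm is undefined.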

The K2 entropy [13] may be estimated from the correlation integrals $C_M(r)$. This is computed (in the limit as $M \to \infty$) as the distance between two successive correlation integrals in log-log coordinates. Specifically


C*,,,(r)r^(.  - 


and  then 


(2.5) 


(2.6) 
The K2 entropy is zero for periodic or quasi-periodic systems, it is positive for chaotic systems and diverges for a white noise. Since low-dimensional strange attractors do produce a small and usually non-integer value of the correlation dimension and a converging entropy, the above statements have on occasion been reversed and a finite, small value of the correlation dimension and a converging K2 entropy have been taken as "proof" of the presence of a strange attractor. As a counterexample to this belief, Osborne and Provenzale [47], and Provenzale et al. [48] have shown that simple stochastic processes, characterized by a power-law power spectrum with random, independent, uniformly distributed Fourier phases, generate time series with finite correlation dimension and converging entropy estimates, both of which are determined by the logarithmic spectral slope.

The stochastic signals considered in refs. [47, 48] are defined through their Fourier representation, i.e., by

$$ x(t_i) = \sum_{k=1}^{N/2} A(\omega_k)\cos(\omega_k t_i + \phi_k), \qquad i = 1, N, \tag{2.7} $$

where $\omega_k = 2\pi k/N\Delta t$ and the $\phi_k$ are random uncorrelated phases. The "saturation" value of the correlation dimension $\nu$ is determined by the logarithmic spectral slope $\alpha$ through $\nu = 2/(\alpha - 1)$ for $1 < \alpha < 3$ [47]. In addition, the numerical estimates of the K2 entropy were found to be convergent when $\alpha > 1$ [48]. The extension
of  these  results  to  systems  whose  power  spec¬ 
trum  has  a  power-law  behavior  only  on  a  limited 
frequency  range  has  been  considered  by  Theiler 
[49].  A  similar  problem  was  also  considered  by 
Harding  et  al.  [37],  who  studied  a  stochastic 
signal  generated  by  a  random  walk  in  Fourier 
space  which  leads  to  a  finite  value  of  the  correla¬ 
tion  dimension.  Clearly,  the  noises  (2.7)  are  not 
associated  with  any  low-dimensional  system.  The 
above  results  simply  show  that  the  standard 


time-embedding  techniques  and  dimension  and 
entropy  calculations  should  not  be  used  without 
a  careful  evaluation  of  the  conditions  for  their 
applicability  and  an  examination  of  the  con¬ 
sistency  of  the  results  obtained.  A  naive  applica¬ 
tion  of  these  methods  may  lead  to  erroneous 
conclusions. 
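
Signals of the random-phase type are easy to synthesise, which makes them convenient test cases for any dimension algorithm. The sketch below is illustrative: it assumes a power spectrum proportional to the frequency raised to the power −α, so that the Fourier amplitudes scale as the frequency raised to −α/2, with overall normalisation set to one. The function names are hypothetical, and the direct summation costs of order N² operations.

```python
import math
import random


def power_law_noise(n_points, alpha, dt=1.0, seed=0):
    """Sum of N/2 cosines with power-law amplitudes and random uniform phases."""
    rng = random.Random(seed)
    half = n_points // 2
    omegas = [2.0 * math.pi * k / (n_points * dt) for k in range(1, half + 1)]
    # Power ~ omega**(-alpha) implies amplitude ~ omega**(-alpha/2).
    amplitudes = [w ** (-alpha / 2.0) for w in omegas]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in omegas]
    return [
        sum(a * math.cos(w * i * dt + p)
            for a, w, p in zip(amplitudes, omegas, phases))
        for i in range(1, n_points + 1)
    ]


def predicted_dimension(alpha):
    """Saturation value 2/(alpha - 1) quoted for spectral slopes 1 < alpha < 3."""
    if not 1.0 < alpha < 3.0:
        raise ValueError("the relation is quoted only for 1 < alpha < 3")
    return 2.0 / (alpha - 1.0)


if __name__ == "__main__":
    signal = power_law_noise(256, alpha=2.0)
    print(len(signal), predicted_dimension(2.0))
```

Feeding such a series through a delay embedding and a correlation-integral estimate gives a quick check of whether a finite dimension estimate could be produced by colored noise alone.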

The colored noise example discussed above is certainly not the only class of random noises which give a finite estimate of $\nu$ and a convergent K2 when finite time series are considered [27, 49]. Following Vio et al. [27], we consider the two time series generated by a linear and by a non-linear stochastic process, given respectively by

$$ \frac{dx}{dt} = \theta x(t) + w(t), \tag{2.8} $$

$$ \frac{dy}{dt} = (\alpha - 0.5)\beta - y(t) + [2\beta y(t)]^{1/2}\,w(t), \tag{2.9} $$

where $w(t)$ is a standard gaussian white noise process. Two such time series are shown in figs. 1a (linear case) and 1b (non-linear case) for the parameter values $\theta = -0.9$ and $\alpha = \beta = 1$. The discrete version of the process in formula (2.8) is a classical AR(1) linear model; the numerical integration of eq. (2.9) is pursued by the local linearization method of Ozaki [52] with $\Delta t = 0.02$ for both series. Unless noted otherwise, we consider time series composed of $N = 4000$ data points. This corresponds to a length of about 15 correlation times, similar to signals commonly encountered in the study of natural systems.
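
The linear process can be generated with a one-line recursion. The following sketch is an illustration, assuming a linear stochastic equation of the form dx/dt = θx + w(t) with θ < 0 and unit-strength white noise w(t); it uses the exact one-step discretisation, which is an AR(1) model with coefficient exp(θΔt). The function names and the `noise_scale` argument are hypothetical additions.

```python
import math
import random


def ar1_coefficient(theta, dt):
    """AR(1) coefficient of the exactly discretised linear process."""
    return math.exp(theta * dt)


def simulate_linear_process(n_points, theta, dt, noise_scale=1.0, x0=0.0, seed=0):
    """Sample dx/dt = theta*x + noise_scale*w(t) at intervals dt (theta < 0)."""
    rng = random.Random(seed)
    phi = ar1_coefficient(theta, dt)
    # Standard deviation of the noise accumulated over one sampling step.
    sigma = noise_scale * math.sqrt((phi ** 2 - 1.0) / (2.0 * theta))
    x = x0
    series = [x]
    for _ in range(n_points - 1):
        x = phi * x + sigma * rng.gauss(0.0, 1.0)
        series.append(x)
    return series


if __name__ == "__main__":
    series = simulate_linear_process(4000, theta=-0.9, dt=0.02)
    print(len(series), ar1_coefficient(-0.9, 0.02))
```

With θ = −0.9 and a sampling step of 0.02 the AR(1) coefficient is exp(−0.018), so successive samples are strongly correlated, as expected for a signal whose spectrum falls off as a power law above a low-frequency plateau.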

Both the linear and the non-linear processes (2.8) and (2.9) generate stationary time series for parameter values in an appropriate range [27]. It is important to note that both the linear and the non-linear time series possess very similar power spectra (a power-law power spectrum $P(\omega) \approx \omega^{-2}$ over a large frequency range) and very similar structure functions (defined below). As shown in ref. [27], the two series differ in that the linear signal $x(t)$ is statistically self-similar (a homogeneous fractal signal), while the non-linear signal $y(t)$ is multifractal and intermittent. This is reflected by the fact that the Fourier phases of the linear system (2.8) are random and uniformly distributed, with no correlation with each other, while some of the Fourier phases of the non-linear signal (2.9) have a non-zero correlation, as revealed for example by the bispectrum analysis. This implies that the stochastic signal given by formula (2.8) has close analogies with those studied in refs. [47, 48], while the signal (2.9) is a truly new entity.

Fig. 1. (a) Time series obtained from the linear stochastic process (2.8) with $\theta = -0.9$ and $\Delta t = 0.02$. (b) Time series generated by the non-linear stochastic process (2.9) with $\alpha = \beta = 1$ and $\Delta t = 0.02$.

Figs. 2a and 2b report the correlation integrals for the linear and non-linear time series respectively; figs. 2c and 2d show the correlation exponent and $K_2(M)$ versus the embedding dimension for the two time series, as computed by linear least-squares fit of $\log C_M(r)$ versus $\log r$ on the scaling range $0.002 \le C_M(r) < 0.02$. Note the "knee" in the correlation integrals for large $M$, at a value $C_M(r) \approx 0.02$, consistent with the results by Theiler [49] on stationary random processes with a power-law spectrum on a finite frequency range. The procedure of phase space reconstruction and the subsequent dimension and entropy calculations give the similar value $\nu \approx 2.5$ (at embedding dimension $M = 8$) and an equally converging entropy for both time series, independent of the linear or non-linear nature of the signals. For both signals, the time delay $\tau$ used in the time embedding procedure, $\tau = 250\,\Delta t$, is near the first zero of the autocorrelation function. In any event, the convergence of the correlation dimension and of the entropy does not significantly depend on the choice of the time delay over a large range of values of $\tau$.

The  computed  value  of  the  correlation  dimen¬ 
sion  for  both  signals  is  slightly  larger  than  the 
value indicated by the expression $\nu = 2/(\alpha - 1) = 2$ when $\alpha = 2$. This is due to the fact
that the power spectrum of the signals (2.8) and
(2.9) tends to become flat at low frequencies,
consistent  with  the  stationary  nature  of  the  pro¬ 
cesses  [27].  When  the  length  of  the  time  series 
increases,  the  noises  (2.8)  and  (2.9)  tend  to 
become  space-filling,  as  required  for  stationary 
stochastic  processes.  However,  this  convergence 
is  slow,  and  for  a  finite  number  of  points  an 
apparently  finite  estimate  of  the  correlation  di¬ 
mension  is  typically  obtained.  Increasing  the 
length  of  the  signal  produces  somewhat  larger 
estimates  of  the  dimension.  Clearly,  an  increase 
of  the  dimension  estimates  with  the  length  of  the 
time  series  should  warn  about  misleading  conclu¬ 
sions;  unfortunately,  such  a  test  is  often  not 
available  in  the  study  of  natural  systems. 

In the case of noises with a power-law spectrum and a low-frequency cutoff
(below which the spectrum becomes flat), Theiler [49] has recently derived
an analytic expression for the correlation integral; the existence of
different scaling regimes at different scales has been detected, the
fractal behavior being associated with the scale range where the spectrum
is power-law. Theiler also estimated a lower bound to the number of points
required in order to observe the space-filling scaling regime; this number
“may have to be extremely large for this regime to be achieved”. The power
spectral properties of the noises considered in the present paper are
similar to those discussed in ref. [49]; note, however, that the signal
generated by (2.9) has been defined through a non-linear stochastic
differential equation (not by its spectrum) and that its Fourier phases
are not independent. Tests for the presence of non-linearity (such as the
BDM test [53]) should give a positive result. Nevertheless, the present
results show that finding that a time series is non-linear is certainly
not sufficient to infer the presence of low-dimensional deterministic
dynamics.

Fig. 2. (a) and (b) report the correlation integrals C_M(r) for the linear
(2.8) and non-linear (2.9) stochastic processes shown in fig. 1. The time
delay τ in the embedding procedure has been chosen to be τ = 250 Δt,
approximately corresponding to the first zero of the autocorrelation of
the signals. The embedding dimension varies from M = 1 to M = 8. (c) and
(d) report the correlation exponent ν_M and the correlation entropy K_2(M)
versus the embedding dimension M for both the linear and non-linear
signals. Crosses refer to the linear signal and circles to the non-linear
process. The error bars on ν_M are the 95% confidence limits of the
least-squares-fit slope of log C_M(r) versus log r; the error bars on
K_2(M) are the standard deviation on the mean value of
[log C_M(r) − log C_{M+1}(r)]/Δt in the scaling range.

Another class of random processes which provide a finite correlation
dimension estimate is obtained by considering a white noise with randomly
superposed jumps of random amplitude (a random saw-tooth). One realization
of such a process is shown in fig. 3a. In this example, the “fast”
dynamics is a white noise uniformly distributed between −1 and 1; we
superposed onto it a random number of consecutively positive and negative
jumps with amplitude A_j = A_0 + η_j, where A_0 = 10 and η_j is uniformly
distributed in (−1, 1). The average time separation between jumps is about
100 Δt. The correlation integrals obtained from this time series are shown
in fig. 3b; again, a value of the time delay corresponding to the first
zero of the autocorrelation function has been chosen (τ = 100 Δt, here
Δt = 1). A scaling regime in C_M(r), with saturating correlation exponent,
is clearly visible at large values of r for the higher embedding
dimensions, leading to a correlation dimension ν ≈ 2.1. A similar
structure was observed by Voges et al. [46] in the analysis of the X-ray
variability of Hercules X-1 (their fig. 5); this behavior was ascribed to
a two-amplitude-range process, where the low-amplitude fluctuations are
due to high-dimensional dynamics and the large-amplitude fluctuations are
determined by a low-dimensional chaotic dynamics. However, such a behavior
can be simulated by a purely stochastic process, as in the example above.

Fig. 3. (a) The “random saw-tooth” signal obtained by superposing random
jumps of large amplitude onto a white noise background. (b) Correlation
integrals C_M(r) for the signal in fig. 3a; τ = 100 Δt and M = 1, ..., 8.

3.  Some  tests  toward  the  goal  of  distinguishing 
between  some  chaos  and  some  noise 

The  examples  given  in  the  previous  section 
suggest  that  the  distinction  between  low-dimen¬ 
sional  dissipative  chaos  and  (correlated)  random 
noise  should  not  be  based  solely  on  correlation 
dimension  estimates.  In  addition  to  those  consid¬ 
ered  here,  other  types  of  stochastic  processes 
certainly  exist  which  mimic  the  properties  of 
low-dimensional  chaos  in  finite  data  sets.  Meth¬ 
ods  other  than  dimension  calculations  should  be 
applied  to  measured  time  series  in  order  to 
extract as much dynamical information as possible. In this regard,
however, we recall that predictability algorithms may also have difficulty
in distinguishing between chaos and correlated noise when a finite number
of points is considered [21].

In a sense, simply examining the time series and its recurrence plots
often indicates whether a meaningful correlation integral analysis can be
performed (more precisely, such an examination often indicates the
analysis should not be performed). For a system believed to contain a
very-low-dimensional attractor (say, dimension less than three), one can
directly inspect phase space trajectories and Poincaré sections; if these
yield either “messy” distributions with no discernible structure or
isolated, non-recurrent patches of points, the correlation integral
estimates should be interpreted with extreme caution. Analogously, time
series which show rare bursts in otherwise unexplored regions of phase
space, as well as an obvious non-stationarity and/or the absence of close
returns in phase space, are not promising candidates for the search for
low-dimensional dissipative chaos.

Clearly,  real  time  series  are  a  mixture  of  de¬ 
terministic  components  and  random  noise;  it  is 
nevertheless  of  some  interest  to  attempt  to  disen¬ 
tangle  the  two  components  (when  possible).  In 
this  section  we  discuss  simple  tests  which  may  be 
applied  to  an  experimental  time  series  in  order 
to  interpret  correlation  dimension  estimates  and 
distinguish  low-dimensional  dynamics  from  sto¬ 
chastic  processes.  These  tests  are  based  on  the 
idea  of  modifying  some  of  the  properties  of  a 
time  series  (i.e.,  on  generating  appropriate  “sur¬ 
rogate”  data,  in  a  language  similar  to  that  used 
in refs. [26, 30]), in order to determine whether the convergence of the
dimension and of the entropy (or some other measured quantity) does or
does not depend on the modified property.

It  is  important  to  stress  the  fact  that  each  test 
in  se  is  not  an  absolute  proof;  at  best  we  are  able 
to  evaluate  the  probability  that  a  series  would 
produce  the  observed  result  by  chance  if  it  were 
chosen  from  an  ensemble  of  signals  with  some 
given  set  of  properties.  These  properties  are 
chosen  in  an  attempt  to  fool  the  algorithm  tested 
and  the  usefulness  of  the  test  depends  on  the 
choice  of  good  surrogate  signals.  The  com¬ 
parison  between  several  of  the  above  approaches 
increases  the  confidence  in  a  distinction  between 
chaos  and  noise.  Finally,  we  recall  that  we  tend 
to  include  in  the  term  “randomness”  the  be¬ 
havior  of  a  dynamical  system  which  cannot  be 
represented  in  terms  of  a  few  active  degrees  of 
freedom,  but  which  must  instead  be  character¬ 
ized  by  a  large  number  of  excited  modes.  The 
definitions  of  “few”  and  “large  numbers”  are 
vague  and  will  depend  on  the  level  of  technology 
and the theory available. This means that one will (mis)classify
sufficiently “high” dimensional chaos as randomness.

3.1.  Space-time-separation  plots 

The  first  test  simply  recasts  the  data  in  the 
correlation  integral  to  make  the  bias  due  to 
dynamical  correlations  more  obvious.  We  recall 
that  Theiler  [23]  demonstrated  that  short-time 
correlations  can  produce  “knees”  in  the  correla¬ 
tion  integral  due  to  the  one-dimensional  nature 
of  the  trajectory.  Analysis  of  fractal  trajectories 
may  result  in  similar  knees  with  non-integer 
dimension. 

The correlation integral represents the probability that a pair of
randomly chosen points on the reconstruction will be less than a distance
r apart. When making the standard calculations, one assumes the distance
between pairs of points is due to the geometry of the reconstruction, not
because the points are dynamically correlated and their separation in
space reflects their being neighbors in time. These temporal correlations
led Theiler to restrict the sums in eq. (2.2) to i, j pairs where
|i − j| > W for some constant W. The
graphs  presented  below  may  be  interpreted  as 
providing  a  method  for  choosing  W;  in  the  case 
of  non-stationary  power-law  noises  they  indicate 
that  there  is  no  value  of  W  for  which  the  correla¬ 
tion  integral  reflects  global  scaling  due  to  recur¬ 
rence. 
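
The restriction to pairs with |i − j| > W can be sketched as follows. This is a minimal illustration rather than the code used for the figures: the function name `correlation_integral` is hypothetical, the maximum norm is an assumption (eq. (2.2) may use a different norm), and the result is normalized by the number of pairs actually retained.

```python
import numpy as np

def correlation_integral(points, radii, W=0):
    """Fraction of pairs (i, j) with |i - j| > W whose separation is
    smaller than r, for each r in radii (maximum norm assumed)."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]
    radii = np.asarray(radii, dtype=float)
    n = len(points)
    if n - W - 1 <= 0:
        raise ValueError("window W leaves no pairs to count")
    counts = np.zeros(len(radii))
    n_pairs = 0
    for i in range(n - W - 1):
        # Only points more than W steps later in time than point i.
        d = np.max(np.abs(points[i + W + 1:] - points[i]), axis=1)
        counts += (d[:, None] < radii[None, :]).sum(axis=0)
        n_pairs += d.size
    return counts / n_pairs
```

With W = 0 every distinct pair is counted, which corresponds to the standard calculation; increasing W discards pairs that are close only because they are neighbors in time.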

For reconstructions from a single time series, each pair of points on the
reconstruction is separated in phase space by some distance r and in time
by some Δt. Our approach is to consider the time separation of points
explicitly, first, through a scatter plot of the separation between two
points in space against their separation in time. This is illustrated in
fig. 4 for a three-dimensional reconstruction of the x series from the
Lorenz equations [54] with σ = 10, b = 8/3 and r = 24.74; the horizontal
axis is separation in time and the vertical axis is the base 2 logarithm
of the separation in space. For small Δt, points are always near neighbors
in space; as their time separation increases so, initially, does their
separation in space.

Fig. 4. Scatter plot of the spatial separation versus time separation
between pairs of points on a trajectory on the Lorenz attractor. The
horizontal axis is separation in time and the vertical axis is the (base 2
logarithm of the) separation in space.

For large data sets, scatter plots are difficult to interpret. An
alternative is to plot contour maps of the fraction of points closer than
a distance r at a given time separation Δt as a function of Δt,
equivalently P(|x(t + Δt) − x(t)| < r) for arbitrary t. For large N and Δt
(in systems in which correlations decay with time), this distribution
converges to the correlation integral for each M. The purpose of these
contour maps is to observe the manner in which this convergence comes
about.
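
One way to compute such contours is to take, for each time separation, the quantiles of the spatial separation between points that far apart in time. The sketch below is illustrative only: the function name `space_time_separation` is hypothetical, the Euclidean norm is an assumption, and the default fractions are the contour levels 1%, 10%, 50%, 90% and 99%.

```python
import numpy as np

def space_time_separation(points, max_dt,
                          fractions=(0.01, 0.1, 0.5, 0.9, 0.99)):
    """For dt = 1..max_dt (in samples), return the distances r below
    which the given fractions of pairs (x(t), x(t + dt)) fall.
    Row dt-1 of the result holds the contour values at separation dt."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]
    if not 1 <= max_dt < len(points):
        raise ValueError("max_dt must be between 1 and len(points) - 1")
    contours = np.empty((max_dt, len(fractions)))
    for dt in range(1, max_dt + 1):
        # Spatial separation of all pairs exactly dt samples apart.
        d = np.linalg.norm(points[dt:] - points[:-dt], axis=1)
        contours[dt - 1] = np.quantile(d, fractions)
    return contours
```

Plotting each column of the returned array against dt gives one contour per fraction; a recurrent signal shows contours that settle to stable values, while a non-recurrent one does not.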

Fig.  5  shows  the  space-time-separation  con¬ 
tour  map  for  the  Lorenz  case  shown  in  fig.  4. 
The  first  zero  of  the  autocorrelation  function 
corresponds  to  295  in  the  integer  units  of  the 
graph.  Fig.  5b  shows  the  distribution  over  longer 
time  scales.  The  length  of  time  for  which  mem¬ 
ory  effects  are  significant  is  surprisingly  long. 
The correlation integral is usually computed including these time
separations with the implicit assumption that the visible oscillations
average out.

Fig. 6 shows the corresponding results for a 1/f power-law noise. It is
clear in this example that the only points with small spatial separation
are dynamically near neighbors: the series is non-recurrent in phase
space. The key point here is that there is no analogue of fig. 5b for the
power-law noise signal: there exist no time scales on which the
distribution is stable. As the correlation integral effectively projects
this graph onto the vertical axis, biased estimates of the correlation
integral will result when the contributions of this projection are
disturbed by structure at small t. In the plots for the power-law noise
this is always the case; whatever time threshold is chosen, the smallest
length scales will always be dominated by the smallest time scales. For
the Lorenz attractor, Theiler's approach removes the contribution of the
region |i − j| < W; from fig. 5a it is clear that for, say, W < 32 the
distribution contains many near neighbors due to dynamical correlations.
For a chaotic system, the decay of correlations with time results in the
convergence of slices at constant Δt to the correlation integral at large
Δt. The memory of initial conditions, reflected here in the persistence of
long time structure of this plot, is greater than might have been
expected.

The connection with the correlation integral is straightforward: C(r) is
simply the sum over “large” Δt for a given r; the usefulness of this graph
is that (a) it provides a quantitative estimate of what constitutes
“large Δt” (namely those values where the contours have reached their
asymptotic behavior), (b) it is sensitive to the specific reconstruction
parameters used and the full non-linear structure in M dimensions as
opposed to the (linear) autocorrelation function or the one-dimensional
mutual information, and (c) computationally, it is a subset of the
correlation integral. Note that these distributions may also be used to
estimate the inside cutoff to scaling range in the spatial separation of
points with minimal dynamic correlation.

Fig. 5. Space-time-separation plots for the Lorenz attractor as in fig. 4.
In this case the scatter diagram is replaced by a contour map at short
time scales (a) and at longer time scales (b). The contours indicate the
fraction of points closer than a distance r at a given time separation Δt.
The different curves correspond to different fractions; curve 1 refers to
a fraction of 1%, curve 2 to 10%, curve 3 to 50%, curve 4 to 90% and
curve 5 to 99%.

Fig. 6. Space-time-separation contour map for a 1/f power-law noise, as a
function of Δt. Same details as in fig. 5.


For  very  long  series  with  low  sampling  rates, 
the  effects  discussed  here  become  small  as  one 
eventually  finds  near  returns  closer  to  a  given 
point  than  its  dynamical  next  neighbor.  For 
reasonable  sampling  rates,  however,  the  data  set 
lengths  required  for  this  to  occur  can  be  ex¬ 
tremely  long. 

The  power-law  noise  is  employed  here  as  a 
clear  example  of  a  series  which  is  non-recurrent 
in  phase  space.  In  this  case,  the  non-stationarity 
of  the  signal  should  be  obvious  by  inspection  of 
the  time  series  itself.  The  space-time-separation 
maps  quantify  the  occurrence  (or  absence)  of 
near  returns  in  more  subtle  time  series. 

Finally,  we  note  a  secondary  bias  in  the  corre¬ 
lation  integral  when  high  sampling  rates  are 
used.  Even  when  near  neighbors  of  a  given  point 
are  omitted  from  the  calculation  centered  at  that 
point,  they  can  still  bias  the  probability  dis¬ 
tribution  centered  on  points  far  away  in  time. 
This appears as a change in the conditional probability
P(|x(t_i) − x(t_{j−1})| < r + Δr | |x(t_i) − x(t_j)| < r) through the
correlation of x(t_{j−1}) and x(t_j) for arbitrarily large values of
i − j.

3.2.  Phase  randomization 

A very useful test is to consider the distribution of the Fourier phases
of the signal under study. In fact, in the case of fractal noise processes
the convergence of the correlation dimension is forced mainly by the shape
of the power spectrum (consistent with the fact that both the power
spectrum and the correlation integral are related to the second moment of
the distribution), while for a low-dimensional dynamics phase correlations
play an essential role. Thus, given an experimentally measured signal x(t)
thought to be chaotic, it is useful to consider the stochastic surrogate
signals obtained by inverting a Fourier spectrum whose power spectrum is
exactly equal to that of the signal under study and whose phases are
random, independent and uniformly distributed. If the convergence is
determined only by the shape of the spectrum (equivalently, by the
autocorrelation function), then the results are not affected by phase
randomization; the invariance of the correlation dimension and entropy
estimates under phase randomization strongly implies that these estimates
are not indicative of low-dimensional dynamics. To our knowledge, this
test was first applied in the study of the motion of freely drifting buoys
in the Pacific Ocean [43].
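
A phase-randomized surrogate of this kind can be generated with a discrete Fourier transform. The following is a minimal sketch under stated assumptions, not the implementation used for the figures: the function name `phase_randomized_surrogate` is hypothetical, the mean (zero-frequency term) is kept unchanged, and for even-length series the Nyquist term is also kept so that the surrogate is real and has exactly the same amplitude spectrum as the input.

```python
import numpy as np

def phase_randomized_surrogate(x, rng=None):
    """Real signal with the same Fourier amplitudes as x and with
    independent phases uniformly distributed in [0, 2*pi)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    surrogate = np.abs(spectrum) * np.exp(1j * phases)
    # The zero-frequency term (and the Nyquist term for even n) must be
    # real for a real signal; keep the original values there.
    surrogate[0] = spectrum[0]
    if n % 2 == 0:
        surrogate[-1] = spectrum[-1]
    return np.fft.irfft(surrogate, n)
```

Calling the function repeatedly with different random generators yields an ensemble of surrogates, on each of which the same dimension and entropy analysis can be repeated.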

As an example of this approach, in fig. 7a we show the time series
obtained by randomizing the Fourier phases of the non-linear stochastic
process shown in fig. 1b, and in fig. 7b we report the corresponding
correlation integrals. The difference between the original and the
phase-randomized time series is visually apparent, since the phase
randomization has destroyed the delicate phase couplings associated with
the intermittent and multifractal nature of the process (2.9). The average
correlation dimension estimates, however, do not show any significant
differences between the original (non-linear) and the phase-randomized
(linearized) signal, as shown in fig. 7c, which reports ν_M versus M for
the two time series. For both signals, the correlation dimension saturates
at approximately the same value. By repeating this procedure over an
ensemble of ten different surrogate signals, corresponding to different
choices of the random phases, we have always obtained saturating
correlation dimension estimates with mean value ⟨ν⟩ = 2.65 and standard
deviation σ_ν = 0.17 (at embedding dimension M = 8). In general, we have
noted that the scatter in the saturation values of ν obtained for an
analogous ensemble of ten surrogates of the linear signal (2.8) is smaller
(⟨ν⟩ = 2.47 and σ_ν = 0.08 at M = 8); this difference, however, is in
general not sufficient to infer the linear or non-linear nature of the
original signal.

The above results can be understood by recalling that the correlation
dimension is related to the second moment of the probability distribution
associated with the time series (in a given embedding space), i.e., it is
related to the autocorrelation or to the power spectrum. For a stochastic
signal, the phase information determines the behavior of higher moments,
i.e., it is related to the generalized fractal dimensions (associated with
the intermittency properties) and to higher-order spectral quantities such
as the bispectrum. All these higher-order measures are obviously modified
by phase randomization in the case of the non-linear signal. Along these
lines, a useful test to detect the presence of non-linearity and phase
correlations in a given stochastic signal is to verify how the spectrum of
multifractal dimensions is changed under phase randomization [27].
Clearly, for the purpose of distinguishing between chaos and noise one
should attempt to choose stochastic surrogates which resemble the original
series as much as possible (even in the higher moments). For example, the
sunspot series has a number of non-linear characteristics; in this case,
rather than simply use phase randomization one might consider non-linear
stochastic model simulations, as discussed in refs. [30, 55].

Fig. 7. (a) Signal obtained by randomizing the Fourier phases of the
non-linear time series shown in fig. 1b. (b) Correlation integrals for the
phase-randomized time series shown in fig. 7a; τ = 250 Δt and
M = 1, ..., 8. (c) Correlation exponent ν_M versus the embedding dimension
M for the original (crosses) and phase-randomized (circles) time series.
Error bars are the 95% confidence limits on the least-squares fit.

On  the  other  hand,  if  the  convergence  is  due 
to  an  underlying  deterministic  dynamics,  phase 
randomization  destroys  the  convergence  of  the 
dimension  and  of  the  entropy  estimates.  As  an 
example,  figs.  8a  and  8b  report  a  4000-point  time 
series  of  the  x  component  from  the  Lorenz  at¬ 
tractor,  with  the  same  parameters  as  above, 
before  and  after  the  phase  randomization.  The 
time  step  used  is  At  =  0.05.  The  appearance  of 
the  two  time  series  is  completely  different  and 
the  correlation  dimension  results  differ  as  well, 
see figs. 8c and 8d. In fact, there is no clear scaling range in C_M(r)
for the phase-randomized signal, as shown in figs. 8e and 8f which report
the local logarithmic slope of C_M(r) for each case. For the
phase-randomized signal, ν_M may be defined as an average slope over a
specified range of length scales. Fig. 8g shows ν_M versus M for both
signals; as one can see, the average
correlation  exponent  for  the  phase-randomized 
signal  does  not  saturate.  By  repeating  the  analy¬ 
sis  over  an  ensemble  of  ten  surrogate  signals  we 
have  always  obtained  non-convergent  correlation 
dimension  estimates  for  the  phase-randomized 
signals. 
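
The local logarithmic slope of a correlation integral can be estimated with a moving linear regression of log C_M(r) on log r. The sketch below uses a five-point window; it is an illustration only, the function name `local_log_slope` is hypothetical, and it assumes that the supplied values of r and C_M(r) are strictly positive.

```python
import numpy as np

def local_log_slope(r, c, width=5):
    """Moving least-squares slope of log c versus log r.
    Returns (geometric centers of the windows, local slopes)."""
    log_r = np.log(np.asarray(r, dtype=float))
    log_c = np.log(np.asarray(c, dtype=float))
    if width < 2 or width > len(log_r):
        raise ValueError("width must be between 2 and len(r)")
    n_windows = len(log_r) - width + 1
    centers = np.empty(n_windows)
    slopes = np.empty(n_windows)
    for k in range(n_windows):
        lr = log_r[k:k + width]
        lc = log_c[k:k + width]
        slopes[k] = np.polyfit(lr, lc, 1)[0]  # slope of the local fit
        centers[k] = np.exp(lr.mean())
    return centers, slopes
```

A plateau in the returned slopes over a range of r indicates a scaling range; the absence of any plateau is what one would expect to see for the phase-randomized Lorenz signal.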

We caution, however, that a change in the correlation integrals under
phase randomization
does  not  necessarily  imply  the  existence  of  an 
underlying  strange  attractor.  For  example,  phase 
randomization  of  signals  with  strong  periodic  or 
quasi-periodic  components  in  their  spectrum  will 
be  more  difficult  to  interpret.  In  principle,  quasi- 
periodic  signals  with  the  geometry  of  tori  can  be 
detected  by  their  integer  dimension.  In  reality, 
however,  the  uncertainty  in  dimension  estimates 
makes  identifying  integers  impractical.  Other 
tests  better  designed  to  identify  signals  such  as 
these  exist  (see  e.g.  ref.  [56]).  The  combined  use 
of  several  methods  is  often  a  crucial  step  in  the 
correct  analysis  of  the  source  of  difficulties  with 
correlation integral techniques. Alternatively, the use of some
noise-filtering algorithms [11, 15, 16] may help in elucidating the true
nature of the system, although non-linear cleaning should be kept distinct
from “bleaching” the data [26].

3.3.  Signal  differentiation 

Another test considers the correlation integral analysis of the first
(numerical) derivative of the signal. For a system governed by a
low-dimensional strange attractor, the value of the correlation dimension
is the same for the original signal as well as for the first (or for a
higher) derivative (note that the time delay may have to be modified). In
the case of a stochastic signal, the first derivative (or difference) of
the signal has a correlation dimension which, when well defined, is often
much larger than that of the original signal, consistent with the change
in the logarithmic spectral slope under signal differentiation. Fig. 9a
reports the first difference signal Δx(t) = x(t + Δt) − x(t) of the x
component of the Lorenz attractor; fig. 9b reports the correlation
integrals for Δx(t) and fig. 9c shows the values of the correlation
exponent versus the embedding dimension for the original time series and
for the first difference signal Δx(t). The same value of τ has been used
for both time series (τ = 5 Δt, where Δt = 0.05); as expected, results are
very similar.

Fig. 8. (a) and (b) report respectively a time series obtained from the
three-dimensional Lorenz model [28] and its phase-randomized counterpart;
we use the standard values σ = 10, b = 8/3 and r = 24.74. A similar
behavior is obtained for r = 28. The time step is Δt = 0.05. (c) and (d)
report the corresponding correlation integrals with τ = 5 Δt and
M = 1, ..., 8; (e) and (f) report the local logarithmic slopes of the
correlation integrals as obtained from a moving five-point linear
regression of log C_M(r) versus log r; (g) shows the (average) correlation
exponent ν_M versus M for the original (crosses) and phase-randomized
(circles) time series.

Fig. 9. (a) reports the first difference signal Δx(t) = x(t + Δt) − x(t)
obtained from the signal in fig. 8a, Δt = 0.05. (b) reports the
correlation integrals for this signal, with τ = 5 Δt and M = 1, ..., 8,
and (c) reports the correlation exponents ν_M versus M for the original
(crosses) and first difference (circles) signals.

Fig. 10. (a) reports the first difference signal obtained from the
non-linear stochastic process (2.9), a = ^ = 1 and Δt = 0.02. (b) reports
the corresponding correlation integrals for τ = 250 Δt and M = 1, ..., 8.
(c) shows the correlation exponents versus the embedding dimension for the
original (crosses) and differenced (circles) signals. From panel (a) the
intermittent nature of the process (2.9) is particularly evident.

The first difference signal Δy(t) = y(t + Δt) − y(t) obtained from the
non-linear stochastic process (2.9) and the resulting analysis are shown
in fig. 10; again, the same time delay τ = 250 Δt has been used for both
signals. As one can see, no saturation is observed in the correlation
exponent of the difference signal. This is due to the fact that the
increments Δy(t) have essentially a white noise spectrum in this case, and
have consequently a much shorter correlation time.
Thus,  when  a  clear  estimate  of  the  derivative  of 
a  signal  is  available,  the  existence  of  a  strong 
difference  in  the  correlation  dimension  between 
the  measured  signal  and  its  first  derivative  is  a 
good  indication  that  the  dynamics  has  a  signifi¬ 
cant  stochastic  component.  Conversely,  if  the 
results  of  the  correlation  analysis  do  not  change 
under  signal  differentiation,  one  has  strong  indi¬ 
cation  that  the  dynamics  is  not  simply  a  fractal 
noise.  Clearly,  there  may  be  severe  difficulties  in 
estimating  the  derivatives  of  measured  signals. 
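
The change in the logarithmic spectral slope under differencing can be checked numerically on a simple stochastic example. The sketch below is illustrative only: it uses an ordinary random walk (spectral exponent close to 2) rather than the processes (2.8) and (2.9), the function name `spectral_slope` is hypothetical, and the fit is restricted to low frequencies (the cutoff `f_max` is an arbitrary assumption) where the discrete spectrum is close to a power law.

```python
import numpy as np

def spectral_slope(x, f_max=0.1):
    """Least-squares slope of log power versus log frequency, using the
    periodogram at frequencies 0 < f < f_max (cycles per sample)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freq = np.fft.rfftfreq(len(x))
    keep = (freq > 0) & (freq < f_max) & (power > 0)
    return np.polyfit(np.log(freq[keep]), np.log(power[keep]), 1)[0]

rng = np.random.default_rng(1)
walk = np.cumsum(rng.standard_normal(2 ** 14))  # power-law spectrum
increments = np.diff(walk)                       # white noise
slope_walk = spectral_slope(walk)
slope_increments = spectral_slope(increments)
```

For the random walk the fitted slope is close to −2, while for its first difference it is close to 0, i.e. the increments have an essentially flat spectrum and a much shorter correlation time.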

Another possibility (similar in spirit to signal differentiation) is to
consider a generic diffeomorphism z = F(x) of the observed variable. For a
sufficiently long series from a deterministic system, no difference should
be observed between the correlation dimension of the signal x(t) and that
of z(t), apart from the effects due to the amplification of measurement
noise. In contrast, the dimension of a stochastic fractal signal should
drastically change under this operation, since the characteristics of the
process will be modified [34]. Changes in the correlation integrals may
depend crucially upon F(x); a careful examination of different classes of
transformations must be pursued. We also note that transformation of the
distribution (effectively, a change in the measurement function) has been
applied by Theiler et al. [26].

3.4.  Independent  realizations 

A  useful  test  is  based  on  considering  several 
independent  realizations  of  the  dynamics,  with 
different  initial  conditions.  In  the  case  of  a  low¬ 
dimensional  dissipative  dynamics,  the  correlation 
dimension  of  the  set  of  points  obtained  by  con¬ 
sidering  all  realizations  at  once  is  equal  to  that  of 
a  single  realization,  provided  that  the  different 
realizations  start  in  the  same  basin  of  attraction. 
On the other hand, for a stochastic system the different realizations tend
to fill the entire space, and one should observe an increase of the
correlation dimension estimate with the number of realizations considered.
In fact, the convergence of the dimension estimates for the noises
considered here is due to the existence of long time correlations, the
effect of which is diminished by considering independent realizations.

To illustrate this behavior, fig. 11a contrasts the correlation exponent
versus the embedding dimension for a set obtained by combining five
independent time series of the x component of the Lorenz attractor against
the results for a single time series of the same total length. Fig. 11b
reports the correlation exponent versus the embedding dimension for a set
of points obtained by composing five independent realizations of the
non-linear stochastic process (2.9), together with the correlation
exponent obtained from the analysis of a single realization. The growth of
the correlation dimension estimate with the number of realizations is
clearly visible in this case. Another approach in this vein is to consider
simultaneous measurement of several quantities. When the quantities
measured reflect different aspects of the process, then the information
content in two simultaneous signals of length T can be much greater than
gaining “twice as many points” in either one of the signals by doubling
the sampling rate (due in part to projection effects).

Fig. 11. Correlation exponent versus the embedding dimension for the
Lorenz attractor (a) and for multiple realizations of the non-linear
stochastic process (2.9) (b). Crosses are for a single realization,
circles refer to the results obtained by superposing five independent
realizations.

3.5.  Structure  function 

A  classic  quantity  in  the  study  of  measured 
time  series  is  the  structure  function  (SF),  which 
is  given  by 

$$S(n) = \sum_{i=1}^{N-n} \left[ x(t_i + n\,\Delta t) - x(t_i) \right]^2 , \qquad (3.1)$$


where $x(t)$ is a scalar signal. For a fractal signal,
the structure function has a scaling behavior

$$S(n) \propto n^{2H} \qquad (3.2)$$

at small values of $n$, where $H$ is called the scaling
exponent (see e.g. refs. [47,57,58]). A fractal
signal whose SF is given by formula (3.2) has a
power-law power spectrum $P(\omega) \sim \omega^{-\alpha}$, where
$\alpha = 2H + 1$ [47]. By composing independent
realizations $x_k(t)$ of a fractal signal on the different
axes of an $N$-dimensional space, one obtains
a fractal trajectory which is parametrically represented
by the set of $x_k(t)$. The correlation dimension
$\nu$ of the trajectory is related to the
scaling exponent by the expression $\nu = 1/H$, if
$\nu \le M$ [47,57,58]. The different signals $x_k(t)$
may be independent realizations or time-delayed
versions of the same signal.
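
A minimal numerical sketch of the structure function and of the estimate of the scaling exponent follows. It assumes the usual squared-increment form of the SF and a log-log least-squares fit for the slope $2H$; the function names, the range of lags and the random-walk test signal are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def structure_function(x, lags):
    """Sum of squared increments of x at each lag n (unnormalised)."""
    x = np.asarray(x, dtype=float)
    return np.array([np.sum((x[n:] - x[:-n]) ** 2) for n in lags])

def scaling_exponent(x, lags):
    """Estimate H from the slope 2H of log S(n) versus log n."""
    s = structure_function(x, lags)
    slope, _ = np.polyfit(np.log(lags), np.log(s), 1)
    return 0.5 * slope

# Test signal (assumption): a Gaussian random walk, for which H = 1/2.
rng = np.random.default_rng(1)
walk = np.cumsum(rng.standard_normal(20000))
lags = np.arange(1, 33)

H = scaling_exponent(walk, lags)
print(f"estimated H = {H:.2f}")
print(f"implied spectral exponent 2H + 1 = {2 * H + 1:.2f}")
print(f"implied correlation dimension 1/H = {1 / H:.2f}")
```

The fit is restricted to small lags because the scaling law is stated only for small $n$; on measured data the range of lags would need to be chosen by inspecting the log-log plot.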

The  structure  function  of  the  stochastic  pro¬ 
cesses  (2.8)  and  (2.9)  displays  a  scaling  behavior 


Fig. 12. (a) Structure function (SF) for the x component of the Lorenz attractor shown in fig. 8a. (b) SF for the non-linear
stochastic process shown in fig. 1b. (c) SF for the first difference deterministic signal shown in fig. 9a. (d) SF for the first difference
stochastic signal shown in fig. 10a.

of the kind (3.2) (at least on the time scales
corresponding  to  the  power-law  regime  in  the 
spectrum).  Alternatively,  for  motion  on  a 
strange  attractor,  which  is  differentiable  in  the 
direction  of  motion  (and  whose  fractal  structure 
is  due  to  “close  returns”  in  phase  space),  the  SF 
has a scaling exponent H = 1 at small values of n.
The  SF  tends  to  oscillate  and  then  to  become 
constant  at  increasing  n,  due  to  the  limited  re¬ 
gion  of  the  phase  space  visited  by  the  system.  As 
an  example  of  the  above  statements,  in  fig.  12  we 
report  the  SF  versus  n  for  the  x  component  of 
the  Lorenz  attractor  (a)  and  for  the  non-linear 
noise  (2.9)  (b).  The  difference  between  the  two 
structure  functions  is  striking;  the  SF  for  the 
non-linear  noise  shows  an  extended  scaling  re¬ 
gime  while  the  SF  for  the  Lorenz  attractor  dis¬ 
plays  the  typical  behavior  discussed  above. 

By  combining  the  test  of  the  signal  differentia¬ 
tion  and  the  SF  calculation  one  obtains  an  even 
more  clear  difference  between  chaos  and  fractal 
noise.  Fig.  12c  shows  the  SF  versus  n  for  the  first 
difference  signal  obtained  from  the  x  component 
of  the  Lorenz  attractor  and  fig.  12d  reports  the 
SF  for  the  first  numerical  derivative  of  the  non¬ 
linear  process  (2.9).  The  SF  of  the  differentiated 
signal  from  the  Lorenz  attractor  is  practically 
equal  to  that  of  the  original  time  series,  while  the 
SF of the noise is now flat, indicating H = 0 and
a  non-convergent  dimension  estimate  for  the  first 
differenced  signal. 
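
Why differencing flattens the SF of a fractal noise can be seen in the simplest case. The lines below are a worked illustration for a plain random walk, which is an assumption made for simplicity and is not the process (2.9); angle brackets denote the expected value of a single term of the sum defining the SF.

```latex
% Illustrative assumption: x_{i+1} = x_i + \xi_i with independent increments
% \xi_i of zero mean and variance \sigma^2 (not the process (2.9) of the text).
\begin{align*}
\bigl\langle (x_{i+n} - x_i)^2 \bigr\rangle
  &= \Bigl\langle \Bigl( \sum_{j=i}^{i+n-1} \xi_j \Bigr)^{2} \Bigr\rangle
   = n\,\sigma^2
  && \Rightarrow\ 2H = 1,\ H = \tfrac12, \\
\bigl\langle (\xi_{i+n} - \xi_i)^2 \bigr\rangle
  &= 2\,\sigma^2 \quad (n \ge 1)
  && \Rightarrow\ 2H = 0,\ H = 0 .
\end{align*}
```

For the undifferenced walk the relations quoted above give a spectral exponent $2H + 1 = 2$ and a correlation dimension $1/H = 2$, while the differenced signal is white and has no finite value of $1/H$.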

4.  Discussion  and  conclusions 

The  (re)discovery  of  low-dimensional  deter¬ 
ministic  chaos  and  the  development  of  data  anal¬ 
ysis  methods  which  can  be  easily  implemented 
have  stimulated  many  works  devoted  to  the 
study  of  experimental  signals  from  the  “chaotic” 
viewpoint.  This  work  has  often  focused  on  decid¬ 
ing  whether  apparently  unpredictable  behavior 
should  be  ascribed  to  the  presence  of  a  low¬ 
dimensional  strange  attractor  rather  than  “ran¬ 
dom”  behavior.  In  many  cases,  however,  the 


desire  for  finding  a  chaotic  attractor  has  led  to  a 
naive  application  of  the  analysis  methods;  as  a 
result,  the  number  of  claims  on  the  presence  of 
strange  attractors  in  vastly  different  physical, 
chemical,  biological  and  astronomical  systems 
has grown (exponentially?). Difficulties in interpreting
correlation integral results led, for example,
Grassberger et al. [15] to state that “... most
(if not all) of these claims have to be taken with
very much caution”. Analogously, in a more
specific  context,  Lorenz  [40],  based  on  the  analy¬ 
sis  of  a  dynamical  system  with  several  weakly 
coupled  degrees  of  freedom,  has  recently  con¬ 
cluded  that  there  is  “no  reason  to  believe  that  an 
extensive  weather  or  climate  system  possesses  a 
low-dimensional  attractor”. 

The  most  convincing  evidence  for  low-dimen¬ 
sional  chaos  most  commonly  arises  when  the 
spatial  complexity  of  the  system  is  limited.  Ex¬ 
amples  include  carefully  controlled  laboratory 
experiments,  transitional  regimes  (for  example 
from  laminar  to  turbulent  flows)  and  some  natur¬ 
al  systems  where  physical  reasons  clearly  imply 
the  presence  of  only  a  few  active  collective 
modes [39]. Extended systems (e.g. fluids) may
require  long  (global)  space  correlations  for  a 
low-dimensional  dynamics  to  exist.  Systems  with 
short  space-correlations,  as  well  as  systems  with 
weakly  coupled  phase  space  variables,  need  not 
be  (globally)  described  by  low-dimensional 
dynamics.  For  the  latter  systems,  the  standard 
correlation integral approach may (again in
Lorenz’s words) “attempt to measure the dimension
of a subsystem” [40].

In  this  paper  we  have  extended  the  results 
given  in  refs.  [47, 48]  and  we  have  considered 
several  different  types  of  random  noises  which 
can  result  in  convergent  estimates  of  the  dimen¬ 
sion  and  of  the  entropy.  In  particular,  we  have 
considered  two  types  of  stationary  stochastic  pro¬ 
cesses,  generated  by  linear  and  by  non-linear 
stochastic  processes.  It  has  been  shown  that  both 
noises  provide  a  very  similar  output  of  the  di¬ 
mension  and  entropy  (numerical!)  estimates. 
These results are of some interest since they
remove the objection that only non-stationary
noises are associated with convergent dimension
and entropy estimates. In the case of stochastic
signals, the convergence of dimension estimates
is a result of the fractal nature of the trajectories.
As it is crucial to distinguish which aspect
of the signal the estimate is reflecting, we have
examined this question directly through space-time-separation
plots and structure functions.

Given  the  interest  of  distinguishing  between 
low-dimensional  chaos  and  random  behavior  in 
observed  signals,  we  have  considered  a  series  of 
tests  which  can  assist  (or  circumvent)  the  inter¬ 
pretation  of  the  correlation  integral.  These  tests 
employ  appropriate  surrogate  data  which  must 
then  be  analysed  by  the  same  methods  employed 
with  the  original  time  series  to  determine 
whether  the  estimates  of  the  dimension,  entropy 
or  any  other  statistic  depend  on  the  characteris¬ 
tics  of  the  time  series  which  have  been  modified. 
In  particular,  we  have  considered  the  procedures 
based  on  randomizing  the  Fourier  phases  of  the 
signal,  numerically  differentiating  the  original 
time  series,  and  the  analysis  of  several  indepen¬ 
dent  realizations  of  the  same  dynamics.  We  have 
also  discussed  how  the  structure  function  can  be 
used  for  contrasting  low-dimensional  chaos  and 
fractal  noises.  In  general,  we  have  shown  that 
low-dimensional  dynamics  may  be  distinguished 
from  fractal  noises  by  using  these  tests.  The  case 
of  randomly  modulated  periodic  (or  quasi- 
periodic)  oscillations  could  be  much  more  com¬ 
plicated,  and  a  clear  distinction  between  chaos 
and  random  modulations  might  best  employ 
other  techniques  (see  e.g.  ref.  [56]). 

In  conclusion,  we  stress  that  there  is  no  simple 
test  which  automatically  and  unequivocally  indi¬ 
cates  the  presence  or  the  absence  of  chaotic 
dynamics;  it  is  only  through  the  comparison  of 
several  different  methods  that  the  dynamical 
processes  underlying  a  given  system  may  be  as¬ 
sessed.  As  always,  a  minimal  physical  insight 
into  the  dynamics  of  the  system  under  study  is  a 
great  asset.  In  this  regard,  we  think  it  would  be 
extremely  useful  to  produce  a  collection  of  sig¬ 


nals  (both  deterministic  and  random)  and  pro¬ 
vide  a  detailed  description  of  the  output  of  the 
various  analysis  techniques  when  applied  to  each 
of  them.  In  this  way,  safer  conclusions  on  the 
presence  of  chaos,  low-dimensional  dynamics 
and/or  noise  from  the  analysis  of  measured  time 
series could be obtained.


This contribution focuses upon extracting information from dynamic reconstructions of experimental time series
data.  In  addition  to  the  problem  of  distinguishing  between  deterministic  dynamics  and  stochastic  dynamics,  applied 
questions,  such  as  the  detection  of  parametric  drift,  are  addressed.  Nonlinear  prediction  and  dimension  algorithms  are 
applied  to  geophysical  laboratory  data,  and  the  significance  of  these  results  is  established  by  comparison  with  results 
from  similar  surrogate  series,  generated  so  as  not  to  contain  the  property  of  interest.  A  global  nonlinear  predictor  is 
introduced  which  attempts  to  correct  systematic  bias  due  to  the  inhomogeneous  distribution  of  data  common  in  strange 
attractors.  Variations  in  the  quality  of  predictions  with  location  in  phase  space  are  examined  in  order  to  estimate  the 
uncertainty  in  a  forecast  at  the  time  it  is  made.  Finally,  the  application  of  these  methods  to  truly  stochastic  systems 
is  discussed  and  the  distinction  between  deterministic,  stochastic,  and  low  dimensional  dynamics  is  considered. 


1. Introduction

It is now generally recognized that complex
dynamical behavior is not restricted to systems
with many active degrees of freedom, and examples
of low dimensional nonlinear systems with
complex and apparently unpredictable behavior
are commonly cited [1-5]. One imagines that
Laplace would have no difficulty with chaos, for
given the exact state of the universe, prediction
of a modern chaotic future is no more difficult
than Newton’s laws. In Laplace’s words, for an
intellect “vast enough to submit this information
to Analysis, ... nothing would remain uncertain,
and the future, as well as the past, would lay before
its eyes” [6]. Yet few would dispute that
there exist many systems which are, in fact, not
deterministic within any known physical framework.
A recurring theme in time series analysis
is the attempt to characterize these two types
of  systems  in  order  to  distinguish  determinism 
and  indeterminism,  chaos  and  stochasticity.  In 
many  cases  of  interest,  this  cannot  be  resolved 
definitely  and  we  shall  focus  on  a  simpler  ques¬ 
tion:  given  a  particular  set  of  observations  (and 
current  technological  constraints),  can  we  detect 
low  dimensional  dynamics  in  a  time  series? 

This  question  will  be  directed  either  at  phe¬ 
nomena  outside  the  lab  or  at  experiments  de¬ 
signed  to  investigate  such  phenomena.  There  is 
no  question  that  numerical  experiments  have 
taught  us  much  about  the  nature  of  chaos.  A 
more  interesting  question  now  appears  to  be 
what  chaos  can  teach  us  about  Nature.  The 
particular  experiment  discussed  here  investi¬ 
gates  dynamical  processes  related  to  the  motion 
of  planetary  atmospheres,  and  provides  an  in¬ 
stance  of  the  reoccurring  attempt  to  distinguish 
“red”  noise  from  nonlinear  determinism.  This 
distinction  is  central  in  determining  the  direc¬ 
tion  of  future  research;  the  best  model  for  an 
atmosphere which amplifies small scale stochastic
disturbances differs from an optimal model
of the complex deterministic interaction of a
reasonable number of modes. Long term prediction
of the first is possible only in a statistical
sense; the present state does not define the future
state after some (nonlinear) decorrelation
time. Prediction in the second case is difficult,
but  possible  in  principle  as  the  information 
needed  is  contained  in  the  state  of  the  system. 
In  this  paper,  we  discuss  techniques  for  this  test 
and  report  initial  results  indicating  that  there  is 
low  dimensional  behavior  in  some  systems  (in¬ 
cluding  the  experiment)  and  not  in  others.  The 
analysis  of  such  systems  is  notoriously  difficult; 
what  we  desire  are  tests  for  low  dimensional  be¬ 
havior  which  are  reliable,  in  the  sense  that  they 
do  not  yield  false  positives. 

There  are  now  a  large  number  of  algorithms 
for  detecting  and  quantifying  low  dimensional 
behavior  and  chaos.  The  known  weaknesses  of 
individual  tests  may  be  addressed  through  the 
analysis  of  surrogate  data:  non-deterministic 
time  series  constructed  to  be  similar  in  appear¬ 
ance  to  the  original  data.  Such  an  analysis  aims 
to  establish  what  aspect  of  the  data  set  an  al¬ 
gorithm  is  quantifying,  by  determining  whether 
original  data  can  be  reliably  distinguished  from 
an  ensemble  of  surrogates,  when  each  data  set  is 
processed  in  precisely  the  same  way.  The  con¬ 
struction  of  these  surrogate  data  sets  is  discussed 
in  section  2  (also  see  [7-9]).  The  usefulness 
of  this  test  will  depend  on  the  quality  of  the 
generator  of  the  surrogate  sets,  as  a  poor  choice 
of  surrogate  generator  will  result  in  sets  which 
are  distinguished,  not  because  of  any  underlying 
determinism,  but  by  some  other  factor.  Indeed 
we  argue  in  section  6  that  some  stochastic  se¬ 
ries  may  be  more  predictable  than  surrogates 
generated  with  similar  statistics. 

Section 3 addresses the construction and evaluation
of dynamic reconstructions from observational
data, where the vector field is approximated
in phase space. The desirability of this
approach has been discussed previously (e.g.
[10,11]). We note that by using additional information
about the macroscopic state of the
system, more useful dynamical reconstructions
can be obtained from the same data set(s).
Once  a  dynamic  reconstruction  is  in  hand,  a 
variety  of  other  questions  may  be  asked.  The 
existence  of  a  good  reconstruction  provides  an 
estimate  of  a  minimal  embedding  dimension  for 
the  system;  clearly  if  one  has  a  six-dimensional 
flow  by  which  the  observed  dynamics  are  well 
described,  then  a  six-dimensional  embedding 
is  a  practical  one.  In  addition,  many  character¬ 
istics  of  the  system  can  be  estimated  from  the 
reconstructed  flow  much  more  easily  than  from 
the  raw  data  directly,  for  example  the  spectrum 
of  unstable  periodic  orbits,  or  Lyapunov  expo¬ 
nents.  As  these  quantities  are  well  defined  for 
a  given  reconstruction,  one  must  address  the 
question  of  how  quickly  the  properties  of  the 
flow  approach  those  of  the  underlying  system. 
In  the  case  of  unstable  periodic  orbits,  this  can 
be  very  rapid  [12]. 

Dynamic  reconstructions  can  also  clarify  ex¬ 
perimental  uncertainties.  In  section  4,  we  ana¬ 
lyze  time  series  from  a  thermally  stressed  rotat¬ 
ing  fluid  annulus  [13].  Comparison  with  surro¬ 
gate  signals  demonstrates  that  the  organization 
in  the  reconstructed  phase  space  dynamics  is 
greater  than  that  arising  from  either  autocorre¬ 
lation  or  simple  advection  alone.  We  also  show 
how  dynamic  reconstructions  offer  a  natural 
method  for  the  detection  of  slow  parametric 
drift.  In  addition,  one  may  use  the  flow  to  make 
predictions  providing  a  direct  test  of  determin¬ 
ism.  We  stress  the  importance  of  what  is  pre¬ 
dicted  and,  in  general,  of  which  aspect  of  the 
time  series  is  reconstructed.  The  quality  of  pre¬ 
dictions  (the  difference  between  the  predicted 
and  observed  values)  may  vary  with  time  due 
to  differences  in  the  volatility  of  different  states 
of  the  system,  variations  in  the  quality  of  the 
predictor,  or  errors  in  observation.  We  demon¬ 
strate  a  method  of  estimating  the  expected  un¬ 
certainty  in  a  given  prediction,  and  discuss  how 
to distinguish between these various causes. Additional
evidence that the time series are in fact
low dimensional is given in section 5, where
we  apply  the  Grassberger-Procaccia  Algorithm 
(GPA)  [14]  to  analyze  the  geometric  structure 
of  the  experimental  data  sets  and  establish  the 
significance  of  the  results  through  comparison 
with  those  from  surrogate  data  sets. 

Finally, in section 6, the results of applying
these prediction techniques to stochastic series
are considered. Laplacian determinism requires
that,  in  the  limit  of  perfect  initial  data,  the  fu¬ 
ture  of  the  system  is  uniquely  defined,  so  the 
systems  considered  in  this  section  are  not  deter¬ 
ministic  in  this  sense.  There  is,  however,  either 
a  low  dimensional  or  a  deterministic  component 
in  their  evolution,  due  to  which  many  station¬ 
ary  stochastic  systems  will  appear  determinis¬ 
tic  relative  to  some  surrogate  series.  We  discuss 
some  of  the  implications  this  holds  for  distin¬ 
guishing  between  low  dimensional  determinism 
and  stochasticity  from  time  series. 


2.  The  data  sets:  observed  and  manufactured 

2.1.  Experiments 

The  primary  data  sets  considered  in  this  paper 
come  from  the  geophysically  inspired  experi¬ 
ments  reported  by  Read  et  al.  [13,15].  These  ex¬ 
periments  were  performed  within  a  fluid  filled, 
rotating  annulus  with  thermally  conducting  side 
walls  and  insulating  boundaries  top  and  bot¬ 
tom.  A  temperature  difference  was  maintained 
between  the  inner  (cooler)  and  outer  side  walls 
providing  an  infinite  dimensional  simulation 
of  the  mid-latitude  circulation  of  the  Earth’s 
atmosphere.  The  temperature  in  the  fluid  was 
measured  by  an  array  of  32  thermocouples, 
uniformly  distributed  in  azimuth  at  mid-height 
and  mid-radius.  By  monitoring  the  flow  rate 
(volume)  and  temperature  of  the  coolant  wa¬ 
ter,  simultaneous  measurements  of  the  total 
heat  transport  through  the  inner  boundary  were 
obtained. 


Of the many reported results, two realizations
are considered here. They correspond to the temperature
series b and d of table 1 of ref. [13] and
are shown in figs. 1a and 1c. The heat flux differs
from the temperature series as it is averaged
around the entire annulus and thus does not display
the roughly periodic structure seen in the local
temperature probe; this structure is due to the
advection of (an evolving) wave pattern around
the annulus. Fourier spectra of these series are
given by Read et al. [13]. Both Read et al. [13]
and R. Smith [16] conclude that these time series
are low dimensional; series b coming from
a strange attractor with a correlation dimension,
$d_2 \approx 3$, while series d, with $d_2 \approx 2$, is considered
to reflect a two-torus [13,15].

The  isolation  of  an  experiment  from  exter¬ 
nal  forces  is  a  major  concern  of  experimental¬ 
ists.  To  obtain  the  long  time  series  for  the  ro¬ 
tating  annulus,  experimental  runs  of  20  hours 
were  required.  One  may  ask  whether  the  envi¬ 
ronment  has  been  sufficiently  isolated,  for  exam¬ 
ple  from  diurnal  temperature  variations,  so  that 
no  systematic  parameter  drift  has  occurred  dur¬ 
ing  the  experiment.  Might  not  an  evolving  three- 
torus  present  a  geometric  structure  with  proper¬ 
ties  similar  to  those  observed?  A  method  of  de¬ 
tecting  slow  parametric  drift  with  dynamic  re¬ 
constructions  is  introduced  in  the  next  section. 
First  we  discuss  the  construction  and  use  of  sur¬ 
rogate  data  sets. 

2.2.  Surrogate  data  sets 

A  common  objection  to  the  dynamical  systems 
analysis  of  data  from  poorly  understood  systems 
is  that  the  significance  of  a  given  result  is  rarely 
established  [17-21].  This  objection  can  be  ad¬ 
dressed  directly  by  considering  a  class  of  non- 
deterministic  surrogate  signals.  The  significance 
of  a  result  is  then  established  by  comparing  it 
with  the  outcome  of  the  same  test  applied  to 
these  surrogate  data  sets. 

The choice of surrogate signals will also depend
on the known weaknesses of the algorithm
employed to analyze the data. While any comparison
with surrogate signals can quantify differences,
only careful choice of the surrogate generator
allows qualitatively new information. The
goal is to generate signals with similar properties
with respect to this weakness, but not containing
“the physics” of the original signal. A result is significant
with respect to the weakness tested if the
algorithm can distinguish the true signal from
the surrogates. In cases where a quantitative result
(or distribution) is produced for both observed
and surrogate signals, one may estimate
the probability that the observed value would occur
by chance in an ensemble of surrogate realizations,
and thereby evaluate the null hypothesis
that the observed signal was a realization of the
surrogate generator. Note that the second formulation
is more difficult than the qualitative comparison
of results, in that it requires the algorithm
to converge for the surrogate signals. This
poses a problem for dimension estimates via the
GPA*.
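
The probability that the observed value would occur by chance in an ensemble of surrogates can be estimated by a simple rank count. The sketch below is one common way of doing this and is offered as an assumption, not as the procedure used in this paper; the function name, the `tail` argument and the "+1" convention (counting the observed series as one member of the ensemble) are illustrative choices.

```python
import numpy as np

def surrogate_p_value(observed, surrogate_values, tail="greater"):
    """Empirical one-sided p-value of `observed` within a surrogate ensemble.

    tail="greater": small p means the observed statistic is unusually large.
    tail="less":    small p means the observed statistic is unusually small.
    """
    surrogate_values = np.asarray(surrogate_values, dtype=float)
    if tail == "greater":
        n_extreme = np.sum(surrogate_values >= observed)
    elif tail == "less":
        n_extreme = np.sum(surrogate_values <= observed)
    else:
        raise ValueError("tail must be 'greater' or 'less'")
    # Count the observed series as one member of the ensemble,
    # so the estimate can never be exactly zero.
    return (n_extreme + 1) / (len(surrogate_values) + 1)

# Hypothetical numbers: a statistic of 5.0 for the data and four surrogates.
print(surrogate_p_value(5.0, [1.0, 2.0, 3.0, 4.0]))
```

With this convention the smallest attainable value is 1/(number of surrogates + 1), so the size of the ensemble sets the resolution of the test.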

Hypothesis  testing  and  model  evaluation  with 
surrogate  data  has  a  long  history  in  statistics 
[22,23].  There  are  often  two  conflicting  moti¬ 
vations  in  choosing  surrogates:  ease  of  analysis 
and  similarity  to  the  observed  data.  One  of  the 
most  common  surrogate  series  is  independent, 
uniformly  distributed  white  noise  which  has  the 
advantage  that  the  expected  distributions  can 
often  be  calculated  analytically.  Most  time  se¬ 
ries,  however,  are  not  uniformly  distributed,  nei¬ 
ther  are  consecutive  observations  independent; 
at  reasonable  sampling  rates  the  series  should 
display some structure. A simple IID surrogate
generator provides the correct distribution by
simply shuffling the data, while series longer
than the observed signal may be obtained by
randomly sampling the observed distribution.

* In their discussion of the method of surrogate data,
Theiler et al. [7] point out that, strictly speaking, an
algorithm need not converge for the null hypothesis to
be rejected. While this is true, it is important to note
the difference between distinguishing the observed series
from the surrogates and estimating a statistic (e.g. a
dimension) from the observed series which describes
the dynamics.
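
The two IID generators just described can be sketched in a few lines. The function names are hypothetical, and sampling with replacement is assumed for the longer series.

```python
import numpy as np

def shuffled_surrogate(x, rng):
    """Random permutation of the data: same values, no temporal order."""
    return rng.permutation(np.asarray(x))

def resampled_surrogate(x, length, rng):
    """Series of arbitrary length drawn (with replacement) from the
    observed distribution."""
    return rng.choice(np.asarray(x), size=length, replace=True)

rng = np.random.default_rng(2)
data = np.sin(0.1 * np.arange(500))                   # stand-in for an observed series
same_length = shuffled_surrogate(data, rng)
longer = resampled_surrogate(data, 2000, rng)
print(len(same_length), len(longer))
```

Both surrogates reproduce the observed distribution of values but destroy all temporal correlations.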

Temporal  correlations  are  reflected  in  the 
Fourier  spectrum  of  the  series.  “White”  series 
have  a  flat  spectrum,  while  those  with  less  power 
at  high  frequencies  are  called  “red”.  The  meth¬ 
ods  of  the  previous  paragraph  destroy  temporal 
correlations  in  the  original  data.  Perhaps  the 
simplest  method  to  amend  this  is  to  produce  the 
time  series  where  the  next  observation  is  cho¬ 
sen  from  a  distribution  determined  by  the  cur¬ 
rent  state.  For  digital  data,  the  amount  of  data 
required  to  estimate  this  conditional  distribu¬ 
tion  will  depend  on  the  resolution  of  the  analog 
to  digital  conversion.  In  this  paper  we  will  be 
concerned  primarily  with  yet  another  surrogate 
generator  which  preserves  the  autocorrelation 
function  of  the  original  data. 
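
A generator of the conditional kind, in which the next value is drawn from a distribution determined by the current state, can be sketched for digital data as a first-order Markov chain. This is an illustrative assumption about one possible implementation: the conditional distribution is estimated simply by listing the observed successors of each digital level, and the function name and restart rule are hypothetical.

```python
import numpy as np

def markov_surrogate(x, length, rng):
    """First-order Markov surrogate for an integer-valued (digitised) series."""
    x = np.asarray(x)
    # successors[v] lists every value observed immediately after level v;
    # drawing uniformly from it samples the empirical conditional distribution.
    successors = {}
    for current, nxt in zip(x[:-1], x[1:]):
        successors.setdefault(int(current), []).append(int(nxt))

    state = int(x[rng.integers(len(x) - 1)])   # start from a level that has a successor
    out = [state]
    while len(out) < length:
        choices = successors.get(state)
        if choices is None:
            # Level seen only as the final sample: restart at a random point.
            state = int(x[rng.integers(len(x) - 1)])
        else:
            state = int(choices[rng.integers(len(choices))])
        out.append(state)
    return np.array(out)

rng = np.random.default_rng(3)
digital = np.round(8 * np.sin(0.3 * np.arange(400))).astype(int)   # stand-in for ADC output
print(markov_surrogate(digital, 20, rng))
```

The finer the digitisation, the fewer successors are recorded for each level, which is the dependence on the resolution of the analog to digital conversion noted above.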

The  point  here  is  to  show  that  there  are  a  va¬ 
riety  of  methods  available  for  constructing  sur¬ 
rogate  series  and  note  that  signals  may  be  indis¬ 
tinguishable  from  one  set,  and  not  another.  The 
insight gained from testing surrogate signals depends
on the particular surrogate generator(s)
adopted.

When  searching  for  low  dimensional  deter¬ 
ministic  dynamics,  an  alternate  approach  for 
generating  surrogates  is  to  use  a  simple  stochas¬ 
tic  model  of  the  system.  While  care  must  be 
taken  not  to  overfit  the  model  to  the  data  (e.g. 
to  construct  epicycles),  this  approach  may  be 
particularly  useful  in  evaluating  dimension  cal¬ 
culations  from  short,  highly  structured  series. 
Used  by  Grassberger  when  considering  climate 
data  [18,20],  this  approach  is  discussed  below 
for  the  sunspot  data. 

Both  predictability  and  correlation  dimension 
estimates  may  be  biased  through  autocorrelated 
signals.  To  test  if  a  given  result  is  significant  with 
respect  to  signals  with  the  same  autocorrelation 
function the following surrogate generator (suggested
by Osborne et al. [24]) may be used:
first compute the Fourier transform of the original
signal, then compute a set of random phases,
and finally invert the transform using the original
amplitudes and a particular set of random
phases to generate a particular surrogate series.
Since  both  the  original  and  surrogate  series  will 
have  the  same  Fourier  amplitude  spectrum,  their 
autocorrelation  functions  will  be  identical.  An 
ensemble  of  surrogate  signals  may  then  be  sub¬ 
jected  to  exactly  the  same  analysis  as  the  original 
data;  surrogates  for  the  annulus  data  generated 
in this way are shown in figs. 1b and 1d. Note
that  we  are  concerned  with  relatively  long  se¬ 
ries  here,  where  many  linear  decorrelation  times 
are  available,  so  that  calculation  of  the  Fourier 
transform  is  not  a  problem.  Signals  from  this  FT 
surrogate  generator  will  be  used  to  demonstrate 
the  significance  of  the  dynamic  reconstructions 
in  section  4  and  correlation  integral  calculations 
in  section  5. 
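
The three steps of the FT surrogate generator can be sketched as follows. This is a minimal illustration using the real-input FFT of NumPy; the function name is hypothetical, and leaving the zero-frequency (and, for even lengths, the Nyquist) term untouched is an implementation choice made here so that the output is real and keeps the mean of the data.

```python
import numpy as np

def ft_surrogate(x, rng):
    """Surrogate with the Fourier amplitudes of x and random phases."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)                       # step 1: Fourier transform
    phases = rng.uniform(0.0, 2.0 * np.pi,          # step 2: random phases
                         size=spectrum.shape)
    randomized = np.abs(spectrum) * np.exp(1j * phases)
    randomized[0] = spectrum[0]                     # keep the mean unchanged
    if n % 2 == 0:
        randomized[-1] = spectrum[-1]               # Nyquist term must stay real
    return np.fft.irfft(randomized, n=n)            # step 3: invert

rng = np.random.default_rng(4)
signal = np.cumsum(rng.standard_normal(1024))       # stand-in for a "red" series
surrogate = ft_surrogate(signal, rng)

# The amplitude spectra agree, hence so do the (circular) autocorrelations.
print(np.allclose(np.abs(np.fft.rfft(signal)), np.abs(np.fft.rfft(surrogate))))
```

An ensemble is obtained by calling the function repeatedly with the same data; each call draws a new set of phases.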

Note  that  it  is  not  necessary  for  a  surrogate 
generator  to  completely  destroy  the  phase  co¬ 
herence  in  a  signal,  and  in  some  cases,  it  is  not 
desirable.  Consider  a  signal  containing  some 
understood  frequencies,  such  as  a  diurnal  cy¬ 
cle  in  a  temperature  record.  Complete  phase 
randomization  will  distort  the  daily  cycles  ob¬ 
vious  in  the  data  and  the  true  signal  may  be 
distinguished  from  the  surrogates  for  this  rea¬ 
son  alone.  A  stronger  result  is  obtained  (in  the 
sense  that  a  more  relevant  null  hypothesis  is 
rejected)  if  the  observed  signal  can  be  distin¬ 
guished  from  an  ensemble  of  surrogates  which 
also  contain  a  diurnal  cycle.  Such  surrogate  se¬ 
ries  may  be  generated  by  retaining  a  subset  of 
the  Fourier  phases  unaltered  and  randomizing 
the  remainder.  (Note  that  the  fine  structure  of 
a  fractal  attractor  will  be  destroyed  simply  by 
randomizing  only  the  phases  corresponding  ei¬ 
ther  to  high  frequencies  or  to  those  frequencies 
with  relatively  low  power  thus  retaining  some 
macroscopic structure.)
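
Retaining a subset of the Fourier phases can be sketched with a boolean mask over the frequency bins. The sketch is an assumption about one way to implement the idea; the function name and the `keep` argument are hypothetical, and the example signal (a single cycle plus white noise) merely stands in for a record with a known periodicity.

```python
import numpy as np

def partial_ft_surrogate(x, keep, rng):
    """Randomize Fourier phases except in the bins where `keep` is True.

    `keep` is a boolean array with one entry per bin of np.fft.rfft(x).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    keep = np.asarray(keep, dtype=bool).copy()
    keep[0] = True                      # always keep the mean
    if n % 2 == 0:
        keep[-1] = True                 # Nyquist term must stay real
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    randomized = np.abs(spectrum) * np.exp(1j * phases)
    return np.fft.irfft(np.where(keep, spectrum, randomized), n=n)

rng = np.random.default_rng(5)
n = 240
t = np.arange(n)
record = np.sin(2 * np.pi * 10 * t / n) + 0.5 * rng.standard_normal(n)

keep = np.zeros(n // 2 + 1, dtype=bool)
keep[10] = True                         # bin of the known cycle (10 cycles per record)
surrogate = partial_ft_surrogate(record, keep, rng)
```

The surrogate keeps the known cycle in phase with the data while the remaining components are scrambled, so any distinction between data and surrogates cannot be attributed to that cycle alone.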

For  some  data  sets,  reconstructions  cannot  be 
distinguished  from  those  of  the  surrogate  gener¬ 
ators  discussed  above;  but  for  many  interesting 
systems  the  construction  of  good  surrogates  will 
require  a  more  detailed  examination  of  the  sys¬ 
tem.  The  underlying  desire  is  often  not  to  iden¬ 


tify  either  nonlinearity  or  chaos,  but  low  dimen¬ 
sional,  deterministic  dynamics.  This  may  be 
pursued  by  employing  the  best  stochastic  model 
available  for  a  given  process  as  a  source  of  sur¬ 
rogate  signals.  One  must  balance  overfitting  the 
model  to  the  data  (allowing  unduly  complex 
models)  against  setting  up  “straw  man”  surro¬ 
gates.  An  excellent  example  is  provided  by  the 
annual  mean  sunspot  numbers.  The  basic  asym¬ 
metries  of  the  sunspot  number  (it  is  strictly 
positive,  increases  more  rapidly  than  it  decays, 
etc.) and the presence of events like the Maunder
minimum [25] make the simple FT surrogates
inappropriate for this series. It is nonlinear
by inspection. Analysis calls for either a
modification of the data set (e.g. Spiegel and
Wolf [26]) or an improved surrogate generator.
By  modifying  the  dynamics  of  a  linear  ARMA 
model,  Barnes  et  al.  [27]  have  constructed  a 
nonlinear  stochastic  model  of  sunspot  number, 
which,  fortuitously,  produces  Maunder  minima. 
Treating  this  model  as  a  surrogate  generator,  the 
significance  of  correlation  integral  results  for 
the  sunspot  series  is  examined  by  Weiss  [28], 
along  with  a  discussion  of  solar  aperiodicity  in 
the  context  of  nonlinear  dynamical  systems. 

In  summary,  different  surrogates  will  test  dif¬ 
ferent  effects.  The  better  the  surrogate  genera¬ 
tor,  the  more  relevant  the  class  of  signals  that  the 
data  set  (and  by  implication  the  system)  can  be 
distinguished  from.  Even  then,  the  comparison 
with  the  best  surrogate  signals  provides  only  a 
necessary  condition  for  the  detection  of  low  di¬ 
mensional  dynamics;  it  is  not  sufficient.  As  we 
are  showing  what  the  signal  is  not,  this  approach 
cannot  establish  what  the  system  is;  in  this  sense 
proving moderate dimensional chaotic dynamics
by this method is similar to proving true randomness:
one only knows when one cannot do it.
3.  Reconstructions 

3.1.  Static  reconstructions 

The  methods  of  nonlinear  dynamical  systems 
theory  discussed  here  require  time  series  to  be 
reconstructed  in  a  geometrical  framework  [29]. 
Consider a single signal measured as a function
of time, $s(t)$. Once the signal is recorded digitally
in discrete time we have

$$s_i = s(i\tau_s), \quad i = 1, 2, \ldots, n_s, \qquad (3.1)$$

where $\tau_s$ is the sampling time (and $s_i$ is digitized
to one of a finite number of values).

Consider a deterministic system with phase
space dimension M^. A trajectory, $x(t)$, of this
system is reconstructed in $M$ dimensions from
a time series of a single observable, $s(t)$, by the
method of delays [30,31] to yield

$$x(t) = \bigl(s(t), s(t-\tau_d), \ldots, s(t-(M-1)\tau_d)\bigr), \qquad (3.2)$$

where $\tau_d$ is called the delay time. The delay time
need not equal $\tau_s$ (although it must, of course,
be an integer multiple of $\tau_s$). In fact the $M-1$
delays used in defining $x(t)$ need not be equal,
although they will be treated as such here. Methods
for choosing $\tau_d$ vary (see e.g. [32-34]);
it is usually related to the decay of information
in the signal with time, either from the linear
autocorrelation time ($\tau_{auto}$) or more general
methods [35]. When constructing nonlinear predictors,
the delay may be chosen to optimize the
predictor as demonstrated in section 4. Breeden
and Packard [36] discuss the case of time series
sampled nonuniformly in time.
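The method of delays of eq. (3.2) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `delay_embed` is hypothetical, the delay is given as an integer number of sampling steps ($\tau_d/\tau_s$), and all $M-1$ delays are taken equal.

```python
from typing import List, Sequence, Tuple


def delay_embed(s: Sequence[float], M: int, delay: int) -> List[Tuple[float, ...]]:
    """Return delay vectors x_i = (s_i, s_{i-delay}, ..., s_{i-(M-1)*delay}).

    `delay` is the delay time in units of the sampling time (an assumption of
    this sketch). The first (M-1)*delay samples lack a complete history and
    are skipped.
    """
    if M < 1 or delay < 1:
        raise ValueError("M and delay must be positive integers")
    start = (M - 1) * delay
    return [tuple(s[i - k * delay] for k in range(M)) for i in range(start, len(s))]


series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
vectors = delay_embed(series, M=3, delay=2)
print(vectors)  # each vector lists the current sample followed by delayed ones
```

A series of $n_s$ samples yields $n_s - (M-1)\tau_d/\tau_s$ delay vectors, which is worth keeping in mind when the series is short.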

The  arguments  which  follow  do  not  depend 
on  the  use  of  this  method  of  delays.  We  have 
achieved  similar  results  with  singular  value 
decomposition  (SVD)  reconstructions  (see 
[37,38,35] ).  Multi-variate  series  also  work  well, 
often  with  significantly  shorter  time  series  in 
terms  of  the  total  duration  of  the  “experiment”. 


This  is  easily  understood  as  multivariate  probes 
can  distinguish  states  in  phase  space  which  ap¬ 
pear  similar  to  univariate  probes  due  to  pro¬ 
jection  effects.  When  working  with  finite  data 
sets,  the  use  of  multi-probe  data  can  add  crucial 
information  on  the  state  of  the  system,  either  by 
directly  characterizing  macroscopic  patterns  or 
through  direct  (and  much  more  efficient)  eval¬ 
uation  of  mode  amplitudes.  We  return  to  this 
issue in section 7. Typically, each series is transformed
to have zero mean and unit standard deviation;
however, the standard deviation can be
varied to change the weighting between different
variables in the interpolation scheme. When
we are concerned with predicting a fixed period
in the future, we consider a third time scale, $\tau_p$,
the prediction time. Each point $x(t)$ on the trajectory
has a scalar image $s(t + \tau_p)$ and we wish
to construct a predictor to determine this image
for any $x$. In other applications the time of the
prediction is determined through some geometric
constraint. For example, when working on a
surface of section the time of the next crossing
must be predicted as well as its location. Alternatively,
when predicting recurrence times, the
main goal of the analysis is to determine $\tau_p$.

3.2.  Dynamic  reconstructions 

Recently  there  has  been  much  interest  in  pre¬ 
dicting  nonlinear  deterministic  systems  and  a 
wide  variety  of  approaches  and  algorithms  have 
been  proposed  (see  [39-43,33,44-48]).  While 
these  systems  differ  in  detail  they  all  attempt  the 
same  task,  since  in  the  context  of  determinis¬ 
tic  analysis,  prediction  in  time  becomes  a  ques¬ 
tion  of  interpolation  in  phase  space.  To  predict 
a  deterministic  system  given  a  description  of  its 
current  state,  one  is  faced  with  the  basic  prob¬ 
lem  of  interpolating  the  future  behavior  based 
on  a  sample  of  the  “nearby”  points.  Like  all  in¬ 
terpolation  problems,  success  depends  on  hav¬ 
ing  a  sufficient  number  of  nearby  points  to  sat¬ 
isfy  the  smoothness  assumptions  of  the  chosen 
algorithm. In the presence of noise, this requirement
is increased so that the variations due to
the noise may be, in some sense, averaged out.
We shall use a global radial basis function predictor
[43] and account for noise by fitting the predictor
to the data in a least squares sense [41].
Further,  we  can  account  for  the  systematic  bias 
introduced  by  extreme  inhomogeneities  in  the 
distribution  by  adjusting  the  weighting  scheme. 
This  method  provides  a  smooth  flow  over  the 
entire  region  of  the  reconstruction  (which  may 
or  may  not  be  an  “attractor”).  As  we  are  inter¬ 
ested  in  finding  global  structures  (e.g.  periodic 
orbits),  this  smoothness,  lost  in  most  local  meth¬ 
ods,  is  important  (see,  however  [48] ). 

When  the  underlying  dynamics  are  not  known 
in  advance,  we  both  construct  and  evaluate  a 
dynamic  reconstruction  from  the  same  data  set. 
To  do  so,  the  data  set  is  typically  divided  into 
two  sections  of  unequal  length:  the  learning  set 
consisting of $n_l$ points from which a reconstruction
is derived and the test set on which various
reconstructions  are  evaluated.  It  is  crucial  that 
this  distinction  should  be  maintained  for  out-of- 
sample  evaluation  of  the  predictor.  That  is  not 
to  say  that  statistics  from  the  ability  to  fit  the 
learning  set  are  not  of  interest,  but  that  these 
two  types  of  statistics  measure  essentially  differ¬ 
ent  things.  Statistics  generated  within  the  learn¬ 
ing  set  reflect  how  well  the  data  can  be  forced 
into  a  given  mold  and  may  be  useful,  for  exam¬ 
ple,  for  internal  consistency  checks  and  locating 
outliers.  Those  generated  from  the  test  set  re¬ 
flect  how  well  the  predictor  generalizes  from  the 
learning  set  to  new  data.  Only  the  latter  are  of 
use  for  cross-validation.  Predictor  “error”  in  the 
two  sets  is  a  very  different  quantity.  For  exam¬ 
ple,  with  the  exact  radial  basis  function  predic¬ 
tor  described  below,  the  in-sample  predictor  er¬ 
ror  can  be  made  zero  for  almost  any  data  set. 

The predictor is based upon a set of $n_c$ centers
in an $M$-dimensional space:

$$x_j^c, \quad j = 1, 2, \ldots, n_c.$$

The choice of centers will be discussed below,
but, in the simplest case, each center might correspond
to a data point in the learning set. Associated
with each of the $n_l$ points, $x_i$, in the
learning set is an observation, $s_i$; $s_i$ may be a future
value of the system, a simultaneous value
of another state variable, or even a past observation
thought to contain noise [49]. In general,
the problem is to construct a predictor (or map),
$F(x): \mathbb{R}^M \to \mathbb{R}^1$, which estimates $s$ for any $x$. We
will consider $F(x)$ of the form

$$F(x) = \sum_{j=1}^{n_c} \lambda_j \phi(\|x - x_j^c\|), \qquad (3.3)$$

where $\phi(r)$ are radial basis functions and the $\lambda_j$
are constants which are determined by observations
in the learning set:

$$F(x_i) \approx s_i. \qquad (3.4)$$

Determining the $\lambda_j$ corresponds to the solution
of the (linear) problem

$$b = A\lambda, \qquad (3.5)$$

where $\lambda$ is a vector of length $n_c$ whose $j$th component
is $\lambda_j$ and $A$ and $b$ are given by

$$A_{ij} = \omega_i \phi(\|x_i - x_j^c\|) \qquad (3.6)$$

and

$$b_i = \omega_i s_i, \qquad (3.7)$$

where $i = 1, \ldots, n_l$ and $j = 1, \ldots, n_c$. Traditionally,
the weights $\omega_i$ reflect the varying confidence
associated with the $i$th observation.

Casdagli  [43]  was  the  first  to  solve  this  prob¬ 
lem  in  the  context  of  predicting  chaotic  systems, 
considering  the  special  case  of  exact  interpola¬ 
tion  where  centers  are  chosen  from  the  learn¬ 
ing  set  and  only  their  images  are  considered  in 
eq.  (3.4).  In  this  case  the  interpolation  on  the 
centers  is  exact;  the  matrix  A  is  square  and  the 
solution for $\lambda$ depends on $A$ being nonsingular.
This is guaranteed when the $x_j^c$ are distinct
and $\phi(r)$ is a radial basis function [50,51].
Typical radial basis functions are $\phi(r)$ = r, r^,
s/r^  +  c,  1  /v'r^  +  c,  and  e~'^  where  c  denotes 
a  constant  often  based  on  the  average  distance 
between  neighboring  centers. 

Casdagli  demonstrated  the  effectiveness  of 
this  approach  and  showed  that  interpolation  in 
both  time  and  parameter  space  was  possible. 
There are, however, several drawbacks when applying
it to noisy data: in this form, the interpolation
fits the centers exactly, and no information
from the points in the learning set not chosen
as centers is used. It is desirable to use this
information,  and  important  to  avoid  overfitting 
or  fitting  the  noise  in  data  exactly  (especially 
since  when  making  the  choice  of  centers,  one 
may  tend  to  select  outliers).  Even  with  numeri¬ 
cal  systems,  computational  constraints  limit  the 
number  of  centers  used. 

In order to include the information available
from the learning set, Broomhead and Lowe
[41] solved this problem in a least squares sense
and studied the behavior of the logistic map.
For the least squares case, the entire learning set
is included in eq. (3.5) ($b$ is of length $n_l > n_c$),
but a smaller number of centers are employed,
and thus $A$ is not square. Further, there is no need
to know the images of the centers, so they need
not correspond to observations from the series.
We seek a $\lambda$ which minimizes the squared
residual $\|b - A\lambda\|^2$.
Choosing the solution which also minimizes
$\|\lambda\|^2$ corresponds to

$$\lambda = A^+ b, \qquad (3.8)$$

where $A^+$ is the Moore-Penrose pseudo-inverse
of $A$. Efficient methods of calculating $A^+$ via singular
value decomposition are discussed in [23].
Noting that the guaranteed solubility of the original
system is lost in this generalization, Broomhead
and Lowe [41] quantified the effect of increasing
the number of centers and considered
this modeling approach as a special case of a neural
network with a guaranteed learning rule.
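The least squares construction of eqs. (3.3)-(3.8) can be illustrated with a short NumPy sketch. Everything named here is an assumption made for illustration: the function names, the Gaussian form $\phi(r) = e^{-r^2/c}$ with $c = 1$, the ten equally spaced centers, and the logistic map test data are not taken from the calculations reported in this paper.

```python
import numpy as np


def gaussian_rbf(r, c=1.0):
    # One possible localized basis function; the width c is an assumed parameter.
    return np.exp(-(r ** 2) / c)


def design_matrix(points, centers, phi=gaussian_rbf):
    # Entry (i, j) is phi(||x_i - x_j^c||).
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return phi(dists)


def fit_rbf(points, images, centers, weights=None, phi=gaussian_rbf, rcond=1e-10):
    # Build A_ij = w_i * phi(...) and b_i = w_i * s_i, then take lambda = A^+ b.
    if weights is None:
        weights = np.ones(len(points))
    A = weights[:, None] * design_matrix(points, centers, phi)
    b = weights * images
    return np.linalg.pinv(A, rcond=rcond) @ b


def predict(points, centers, lam, phi=gaussian_rbf):
    # F(x) = sum_j lambda_j * phi(||x - x_j^c||).
    return design_matrix(points, centers, phi) @ lam


# Illustrative data: one-step prediction of the logistic map x -> 4x(1-x).
x = np.empty(600)
x[0] = 0.3
for n in range(599):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])
states, images = x[:-1, None], x[1:]
learn, test = slice(0, 500), slice(500, 599)      # learning set and test set
centers = np.linspace(0.0, 1.0, 10)[:, None]      # n_c = 10 < n_l = 500
lam = fit_rbf(states[learn], images[learn], centers)
max_err = float(np.max(np.abs(predict(states[test], centers, lam) - images[test])))
print("largest out-of-sample error:", max_err)
```

Note that the centers in this sketch are not drawn from the learning set, which is permitted in the least squares formulation because the images of the centers are never needed.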

We continue this approach, introducing the
weights $\omega_i$, and investigate the sensitivity of the
solution to the choice of centers and the effects of
observational  noise.  Given  the  inhomogeneous 
(often  singular)  distribution  of  data  on  a  chaotic 
attractor  (see  e.g.  [52] ),  these  weights  can  also 
be  used  to  provide  a  more  uniform  prediction 
error  across  the  attractor  by  reducing  the  impor¬ 
tance  of  the  dense  regions  of  the  reconstruction. 

One  common  objection  to  the  use  of  radial 
basis  function  interpolation  arises  from  the 
large  number  of  free  parameters  employed,  one 
per  center  in  the  original  formulation.  The  least 
squares  formulation  addresses  this  question  of 
parsimony. When determining the coefficients
of eq. (3.5), we perform an SVD of the matrix $A$.
In doing so, a tolerance is set as to the smallest
meaningful value a singular value can take [23].
Values below this threshold are considered superfluous
and suppressed (set equal to zero).
This prevents “extra” degrees of freedom in the
model from overfitting “noise fluctuations”. As
the threshold is raised, the estimated uncertainty
in the modeling parameters (the $\lambda_j$) decreases
dramatically, with little effect on the least squares
residual or the in-sample predictor error. This might be taken
to mean that higher tolerances were preferred
to avoid fitting noise in the learning set. Out-of-sample
prediction error statistics often conflict
with this interpretation: there are examples for
which, although the estimated uncertainty of the
$\lambda_j$ is greater, low threshold models consistently
yield better out-of-sample prediction statistics.
This  implies  that  the  model  is  not  fitting  noise 
in  the  learning  set.  It  is  here  that  a  difference 
between  radial  basis  functions  is  observed:  for  a 
fixed  tolerance  and  an  identical  choice  of  cen¬ 
ters,  models  with  <i>(r)  =  e“'’  consistently 
use  fewer  degrees  of  freedom  than  those  with 
(j>{r)  —  or  <p(r)  =  r. 
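The role of the tolerance can be made concrete with a small sketch of a truncated SVD solve. The function name `truncated_svd_solve`, the relative form of the tolerance (a fraction of the largest singular value), and the Gaussian test matrix are assumptions of this illustration, not details of the calculations described above.

```python
import numpy as np


def truncated_svd_solve(A, b, tol):
    """Minimum-norm least squares solution of b = A lam, keeping only the
    singular values larger than tol times the largest one.

    Returns the solution and the number of singular values retained, which
    serves as a count of the degrees of freedom actually used.
    """
    U, sv, Vt = np.linalg.svd(A, full_matrices=False)
    keep = sv > tol * sv[0]
    inv_sv = np.zeros_like(sv)
    inv_sv[keep] = 1.0 / sv[keep]          # suppressed values are set to zero
    lam = Vt.T @ (inv_sv * (U.T @ b))
    return lam, int(keep.sum())


# Illustrative problem: 12 Gaussian basis functions fitted at 50 points.
x = np.linspace(0.0, 1.0, 50)
centers = np.linspace(0.0, 1.0, 12)
A = np.exp(-(x[:, None] - centers[None, :]) ** 2)
b = 4.0 * x * (1.0 - x)

lam_low, n_low = truncated_svd_solve(A, b, tol=1e-12)    # low threshold
lam_high, n_high = truncated_svd_solve(A, b, tol=1e-3)   # high threshold
print("degrees of freedom used:", n_low, n_high)
print("coefficient norms:", np.linalg.norm(lam_low), np.linalg.norm(lam_high))
```

Raising the threshold can only remove terms from the solution, so the number of retained degrees of freedom and the norm of the coefficient vector do not increase; whether the out-of-sample error improves must be checked on the test set.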

Predictions more than one sampling time into
the future can be made either by direct forecasts,
constructing a predictor for this time scale, or
through iterative forecasts, repeatedly using a
predictor which forecasts a smaller time step.
Farmer and Sidorowich [49] conclude that iterative
forecasts are generally better than direct
forecasts, but also present an experimental example
where the reverse is observed. Stokbro
[48]  also  compares  the  two  as  a  function  of  the 
forecast  time  and  comments  on  the  choice  for 
the  basic  time  step.  Direct  forecasts  are  used  in 
this  paper. 

As noted above, minimization of $\|b - A\lambda\|^2$
with all $\omega_i$ equal results in a bias in favor of the
frequently visited regions of phase space. In order
to achieve a good reproduction throughout
phase space, such weighting is not desirable, as
argued in the next section. One method to account
for this is to partition the phase space and
allow only a specified number of points in each
partition. In the presence of noise it is preferable
to retain all the observations and adjust the $\omega_i$
so that the partitions are more equally weighted.
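One simple way to weight partitions more equally is sketched below. The specific rule, $\omega_i = N_p^{-1/2}$ with $N_p$ the number of learning points in the partition containing point $i$, is an assumption chosen for illustration; because each row of eq. (3.5) is multiplied by $\omega_i$, a point contributes $\omega_i^2$ times its squared residual, so this rule gives every occupied partition the same total weight. The name `partition_weights` is hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def partition_weights(labels: Sequence[Hashable]) -> List[float]:
    """Return w_i = 1 / sqrt(N_p) for each point, where N_p is the number of
    points sharing that point's partition label."""
    counts = Counter(labels)
    return [counts[label] ** -0.5 for label in labels]


# Four observations fall in partition "a" and one in partition "b".
labels = ["a", "a", "a", "a", "b"]
weights = partition_weights(labels)
print(weights)
```

All observations are retained; only their relative influence on the fit changes. Other exponents between 0 and 1/2 would interpolate between the unweighted fit and this fully equalized one.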

3.2.1.  The  choice  of  reconstruction  centers 

We  now  consider  the  question  of  how  to  deter¬ 
mine  the  centers.  Four  approaches  to  this  ques¬ 
tion  are  to  choose  the  centers  either 

(i)  randomly  (or  uniformly)  in  the  region  of 
phase  space  explored  by  the  data, 

(ii)  with  respect  to  the  probability  density 
(measure)  on  the  reconstruction, 

(iii)  spatially  uniform  on  the  reconstruction, 

(iv)  with  respect  to  the  local  divergence  on  the 
reconstruction. 

Methods  (iii)  and  (iv)  appear  the  most  ro¬ 
bust  in  terms  of  providing  good  reconstructions 
with  a  limited  number  of  centers,  but  unfortu¬ 
nately,  these  results  appear  to  be  system  depen¬ 
dent.  There  are  several  shortcomings  in  methods 
(i) and (ii). Placing the centers uniformly in
space  works  well  when  the  system  explores  the 
entire  region,  as  with  the  logistic  map  in  one  di¬ 
mension.  In  higher  dimensional  spaces,  there  are 
often  large  lacunae  into  which  the  system  does 
not  venture;  placing  many  centers  in  such  gaps  is 
counter-productive, at least when localized basis
functions are employed*. Numerical experiments
indicate this is particularly relevant in
cases where the underlying dynamics is not determined
by a simple analytic formulation (contrast
a true surface of section of a flow with that
of an analytically defined map), perhaps due to
the smoothness of the dynamics.

* I would like to thank James Theiler for pointing out
this qualification.

Centers  may  be  placed  uniformly  with  respect 
to  the  probability  density  on  the  reconstruction 
either  by  choosing  them  equally  spaced  in  time 
or  randomly  sampling  the  series.  This  initially 
attractive  idea  often  yields  poor  results.  One  rea¬ 
son  for  this  can  be  understood  in  the  case  of 
flows  where  the  speed  in  phase  space  varies  from 
point  to  point.  An  ideal  illustration  of  this  effect 
is  provided  by  the  Duffing  oscillator  near  homo¬ 
clinic  bifurcation  (see  [53,54]).  In  this  case,  a 
trajectory  spends  most  of  its  time  near  the  fixed 
point,  while  the  centers  “should”  be  distributed 
over  the  relatively  rare  excursions.  With  maps, 
inhomogeneities  in  the  measure  also  result  in  a 
poor  distribution  of  centers. 

One method to distribute the centers uniformly
on the reconstruction is simply to disallow
centers closer than some nearest center
distance $d_{nc}$. This succeeds in the Duffing case
and will, in general, avoid the accumulation of
centers in the slow moving regions of the reconstruction
which are relatively easy to predict.
But that is the real point. Once the basic
skeleton of the reconstruction is covered, it is
reasonable to place additional centers in regions
where the fine structure of the flow is greatest
and where prediction is most difficult. Note that
this need not correspond to the fine structure
of the probability density or geometry; the fine
structure here is in the vector field of the phase
space flow, not that of the attractor. Also note
that, while the location of additional centers allows
the predictor to develop fine structure, this
will not occur unless the data are weighted toward
the recovery of that fine structure. Returning
to the Duffing oscillator, we would like to
combine (iii) and (iv), covering the excursions
and also the region about the unstable manifold
near the origin so that the beginning of an excursion
is predicted. Locations where the flow
is contracting need not be sampled so densely.

The  importance  of  these  effects  in  a  given  re¬ 
construction  may  be  determined  by  dividing  the 
reconstruction  space  into  partitions  and  exam¬ 
ining  the  errors  made  in  each  region.  When  the 
centers  are  distributed  uniformly  on  the  recon¬ 
struction,  a  straightforward  way  to  partition  the 
reconstruction  is  to  classify  each  point  according 
to  the  center  to  which  it  is  nearest.  This  is  now 
demonstrated  for  rotating  annulus  data. 

4.  Applications  to  laboratory  data 

We now apply the ideas of the last section to
the rotating annulus data. Consider first a dynamic
reconstruction of data set series b built
from a learning set consisting of 2K points
(1K = $2^{10}$) from the first 16K data points of
this 50K point data set. The reduction from 16K
to 2K was achieved by increasing the sampling
time by a factor of 8; thus all time steps considered
in reconstruction will be multiples of $8\tau_s$.
A total of $n_c = 128$ centers were chosen such
that no two were closer than a nearest center
distance, $d_{nc}$; this was implemented by choosing
an initial value of $d_{nc}$ large enough so that less
than half the desired number of centers were
found on the first pass through the learning set.
The value of $d_{nc}$ was then decreased by a factor
of 0.7 and the process repeated iteratively in
order to avoid over-sampling any one segment
of the learning set. In this case, the delay time
$\tau_d = 4\,(8\tau_s)$ and an embedding dimension,
$M = 5$, were also chosen taking into account
the results of the correlation integral calculations
of section 5. We shall refer to this model,
with the exponential $\phi(r)$, as reconstruction A.
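The iterative selection of centers by nearest center distance can be sketched as follows. This is a reading of the procedure described above rather than the code used for reconstruction A: the function name `select_centers`, the Euclidean metric, and the synthetic two-dimensional points are assumptions, and the caller is responsible for choosing an initial distance large enough that the first pass finds fewer than half the desired centers.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def select_centers(points: Sequence[Point], n_centers: int, d_init: float,
                   shrink: float = 0.7) -> Tuple[List[Point], float]:
    """Pick n_centers points such that no two are closer than the final d_nc.

    Each pass scans the learning set in order and accepts a point only if it
    lies at least d_nc from every center chosen so far; d_nc is multiplied by
    `shrink` between passes until enough centers are found.
    """
    if n_centers > len(set(points)):
        raise ValueError("not enough distinct points for the requested centers")
    centers: List[Point] = []
    d_nc = d_init
    while len(centers) < n_centers:
        for p in points:
            if len(centers) == n_centers:
                break
            if all(math.dist(p, c) >= d_nc for c in centers):
                centers.append(p)
        if len(centers) < n_centers:
            d_nc *= shrink
    return centers, d_nc


# Synthetic stand-in for a learning set (not the rotating annulus data).
points = [(math.cos(0.37 * k), math.sin(0.61 * k)) for k in range(400)]
centers, d_final = select_centers(points, n_centers=20, d_init=1.5)
print(len(centers), d_final)
```

Because the threshold only decreases, every pair of selected centers is separated by at least the final value of $d_{nc}$, and repeated passes spread the centers over the whole learning set instead of exhausting its first segment.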

The  initial  results  are  presented  in  fig.  2.  The 
three  panels  show  (a)  the  observed  (solid)  and 
predicted  (symbol)  time  series  as  a  function  of 
time,  (b)  the  absolute  value  of  the  prediction  er¬ 
ror,  and  (c)  the  distance  between  the  point  being 
predicted and the nearest center to it, $d_{nc}$. The
prediction time ($\tau_p = 18\,(8\tau_s)$) was chosen as
twice $\tau_{auto}$ (the first zero of the linear autocorrelation
function). Each of the predictions was
made at this fixed distance into the future; the series
shown is taken from the beginning of the test
set and represents completely out-of-sample testing.
In panel 2a, the prediction time is just over
5  of  the  separation  of  the  tick  marks.  Comparing 
the  first  2  panels,  it  is  observed  that  large  errors 
often  correlate  with  extreme  values  of  the  mea¬ 
sured  signal.  Occasionally,  there  are  episodes  of 
poor  predictions  (not  shown)  which  do  not  cor¬ 
respond  to  extreme  values  of  the  signal  but  do 
correspond to large values of $d_{nc}$; this implies
that  the  trajectory  is  located  in  a  region  of  phase 
space  not  explored  during  the  learning  set.  Pre¬ 
dictions  in  such  regions  are  extrapolations  and 
generally  not  reliable. 

The  effect  of  varying  reconstruction  parame¬ 
ters  and  choice  of  basis  function  is  illustrated  in 
fig.  3,  which  shows  the  cumulated  predictor  er¬ 
ror  profile,  P(e ),  for  three  different  reconstruc¬ 
tions.  These  graphs  display  the  fraction  of  the 
learning  set  predicted  to  within  a  given  error.  For 
example,  the  solid  line  denoting  the  reconstruc¬ 
tion  of  fig.  2  shows  that  half  the  learning  set  was 
predicted  with  log2 (error)  <  -3  corresponding 
to an accuracy of approximately 6 bits. The rightmost
(short-dashed) line corresponds to a similar
reconstruction with $\tau_d = 8\tau_s$ (a factor of 4
shorter than reconstruction A). The corresponding
predictions are about 0.5 bits worse, more so
for small errors. $\tau_d$ was chosen to optimize this
distribution, although for $\tau_d$ slightly greater than
the chosen value the variation was small.
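A cumulated predictor error profile of this kind is straightforward to compute. The sketch below assumes that the profile is the fraction of predictions whose absolute error does not exceed a given threshold, with thresholds expressed as base-2 logarithms; the function name and the sample errors are illustrative only.

```python
import bisect
import math
from typing import List, Sequence


def cumulative_error_profile(errors: Sequence[float],
                             log2_thresholds: Sequence[float]) -> List[float]:
    """Fraction of predictions with log2|error| at or below each threshold.

    An error of exactly zero is treated as log2|error| = -infinity, so it is
    counted under every threshold.
    """
    logs = sorted(-math.inf if e == 0 else math.log2(abs(e)) for e in errors)
    n = len(logs)
    return [bisect.bisect_right(logs, t) / n for t in log2_thresholds]


errors = [0.5, -0.25, 0.125, -0.0625]       # signed prediction errors
thresholds = [-4.0, -3.0, -2.0, -1.0]       # values of log2(error)
print(cumulative_error_profile(errors, thresholds))
```

Reading the threshold at which the profile reaches 0.5 gives the error to which half of the predictions are accurate; a shift of one unit along the horizontal axis corresponds to a factor of 2 in error.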

The long-dashed line in fig. 3 shows the distribution
for a reconstruction similar to reconstruction
A, but using the power-law $\phi(r)$; as was often
observed, the exponential provided a slightly better
fit. Although we shall not discuss the effect
of different basis functions further, it is interesting
to note that the two predictors tend to yield
similar predictions across phase space. Indeed,
they are in closer agreement with each other than
with the observations. This is shown in fig. 4b
which is a scatter plot of the prediction of reconstruction
A, with the exponential $\phi(r)$, against that
with the power-law $\phi(r)$. Panel 4a is a similar plot with
reconstruction A against the observations. For
small observed values, the two predictors remain
in rough agreement although both are inaccurate;
the inaccuracy results, in part, from the low
weighting the least squares fit assigns to the less
commonly observed values. By adjusting the
weights, $\omega_i$, we can force the distribution of errors
to be more uniform over the reconstruction.

Fig. 2. An extract of the results of applying reconstruction A to time series b. (a) the observed (solid) and predicted (symbol)
temperature values, (b) the error, and (c) nearest center distance for the point from which the prediction was made.

The  cumulated  predictor  error  profiles  show 
a slow decay in predictability as $\tau_p$ increases
comparable  to  Read’s  Lyapunov  estimate  of 
1.79  X  10"^  bits  per  second  (or  one  bit  per  ad- 
vection  period).  As  these  values  are  small,  it 


may  be  argued  that  they  are  numerically  zero 
and the system is not, in fact, chaotic. One
alternative  is  the  parametric  drift  mentioned 
above.  We  give  evidence  below  that  this  is  not 
the  case.  Of  course,  the  significance  of  the  Lya¬ 
punov  exponent  can  be  addressed  directly  via 
comparison  with  the  distribution  of  values  ob¬ 
tained  from  surrogate  sets;  where  the  Lyapunov 
exponent  estimates  from  the  surrogate  signals 
are  used  to  define  the  expected  range  of  values 
to  be  considered  as  computationally  equivalent 
to zero. It should also be noted that a different
predictor was constructed for each of these prediction
times (i.e. a direct predictor for each
value of $\tau_p$). The decay of predictability with
time in this instance may differ from the case of
an iterated fixed-step predictor.

Fig. 3. Cumulated predictor error profile for a reconstruction with $n_c = 128$, $M = 5$, $\tau_p = 18\,(8\tau_s)$ and: exponential $\phi(r)$,
$\tau_d = 4\,(8\tau_s)$ (solid); power-law $\phi(r)$, $\tau_d = 4\,(8\tau_s)$ (long dashed); exponential $\phi(r)$, $\tau_d = 1\,(8\tau_s)$ (short dashed). The horizontal
axis is the base-2 logarithm of the error.

Rather than estimate the Lyapunov exponents
of surrogate series, we investigate the significance
of observing this level of predictability,
in particular, whether predictions of similar accuracy
would be found in other signals with the
same autocorrelation function. To do so we construct
surrogate series with an FT surrogate generator
and consider the prediction of a reconstruction
with parameters identical to those of
reconstruction A above. The resulting cumulative
error profiles are shown in fig. 5. Eight surrogate
series, each with a different set of random
phases, were analyzed; the results shown have the
lowest (best) average absolute predictor error.
Considering the error to which 50% of the series
is predicted, reconstruction A is almost one bit
lower, implying that the prediction error is almost
a factor of 2 less, easily distinguishing the
observed data from the surrogate series.
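A phase-randomizing FT surrogate generator can be sketched with NumPy's real FFT routines. This is a generic illustration of the idea (keep the Fourier amplitudes, replace the phases with random ones), not the generator used for fig. 5; the function name `ft_surrogate`, the seed, and the synthetic input signal are assumptions.

```python
import numpy as np


def ft_surrogate(series, rng):
    """Return a surrogate with the same Fourier amplitudes as `series` but
    with phases drawn uniformly from [0, 2*pi)."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    randomized = np.abs(spectrum) * np.exp(1j * phases)
    randomized[0] = spectrum[0]            # keep the mean unchanged
    if n % 2 == 0:
        randomized[-1] = spectrum[-1]      # Nyquist term must remain real
    return np.fft.irfft(randomized, n=n)


rng = np.random.default_rng(0)
t = np.arange(256)
data = np.sin(0.2 * t) + 0.5 * np.sin(0.71 * t) ** 2   # synthetic test signal
surrogate = ft_surrogate(data, rng)

same_amplitudes = np.allclose(np.abs(np.fft.rfft(surrogate)),
                              np.abs(np.fft.rfft(data)))
print("amplitude spectrum preserved:", same_amplitudes)
```

Since the (circular) autocorrelation function is determined by the amplitude spectrum alone, each call with a fresh set of random phases yields a different series with the same linear correlations, which is what makes it a suitable null model for a test of this kind.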

Noting that the distributions do not appear
to be Gaussian, we can reject the hypothesis
that these two realizations either have the same
mean (via the t-test) or were generated from
the same distribution (via the Kolmogorov-Smirnov
test) at well over the 0.99 confidence
level. We wish to stress both the significance and
limitations of this statement. The surrogate generator
here preserved the autocorrelation function
of the data set, and the radial basis function
predictor easily distinguished five-dimensional
reconstructions of these two signals. What we
have really shown is that this 5D reconstruction
is more coherent than that of these surrogate
data sets. This is somewhat different from establishing
that the data arise from a deterministic
five-dimensional system. Further evidence
that the data do in fact reflect low dimensional
dynamics is provided by the correlation integral
results.

As an additional test, we construct a “surrogate
predictor” to determine whether the predictability
of this signal is only due to the advection
of a slowly evolving signal. Look again at the
observational and the surrogate version of series
b in fig. 1; the coherence between one “wave”
and the next is stronger in the real signal than in
the surrogate data. The surrogate predictor simply
projects forward the last observed point an
integer number of advection periods in the past.
The cumulative error profile for this predictor
is shown in fig. 5 for both the observed data set
(short-dashed line) and the surrogate (the rightmost,
fine dotted line). Although it is not as accurate
as reconstruction A, this one dimensional
predictor clearly differentiates between the true
signal and these surrogates, demonstrating one
limitation of the FT surrogate generator in this
case. (This originates, in part, from the loss of
phase coherence of the periodic advection signal
in the FT surrogates, as noted above.) To establish
that a system is chaotic through surrogate
signals, we would have to reject all nonchaotic
surrogates; this is clearly not feasible, and highlights
the importance of the selection of surrogate
signals.

Fig. 4. Scatter plots of the predictions of reconstruction A (horizontal) against (a) the observed

Fig. 5. Cumulated predictor error profile for reconstruction A on (a) observed (solid) and (b) surrogate (long dashed)
series, and for the period 1 surrogate predictor on (a) observed (short dashed) and (b) surrogate (dotted) series. The horizontal axis is log2 (error).
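The surrogate predictor admits a very short sketch. It assumes that the advection period is known and equals an integer number of samples, and that the prediction is the most recent already-observed value lying a whole number of periods before the target time; the name `period_predictor` and the toy series are illustrative.

```python
from typing import Sequence


def period_predictor(series: Sequence[float], i: int, horizon: int, period: int) -> float:
    """Predict series[i + horizon] using only samples up to index i.

    The target time is shifted back by the smallest whole number of periods
    that places it at or before index i.
    """
    n_periods = -(-horizon // period)          # ceiling of horizon / period
    j = i + horizon - n_periods * period
    if j < 0:
        raise IndexError("not enough history for this horizon and period")
    return series[j]


# For an exactly periodic toy signal the predictor makes no error.
signal = [float(k % 5) for k in range(40)]
prediction = period_predictor(signal, i=20, horizon=7, period=5)
print(prediction, signal[27])
```

Applied to real data, the errors of this one dimensional predictor measure how much of the predictability comes from wave-to-wave coherence alone, which provides a baseline against which a dynamic reconstruction can be compared.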

The  inhomogeneity  of  the  spatial  distribution 
of  points  in  the  series  b  reconstruction  is  re¬ 
flected  in  fig.  6  which  shows  a  histogram  of  the 
number  of  times  each  center  is  nearest  to  the 
point  from  which  a  prediction  is  made  in  the 
test  set.  It  is  convenient  to  use  this  partition  of 
the phase space by nearest center to examine the
variation of predictor error with location as well.
This  is  shown  in  fig.  7  which  will  be  used  to  pre¬ 
dict  the  error  associated  with  each  prediction  of 
the  time  series  below.  Examining  the  distribu¬ 
tion  of  errors  in  the  learning  set  about  individual 
centers  provides  examples  where  the  data  den¬ 
sity  is  high  and  the  average  error  is  below  the 
global  average.  Simultaneously,  the  error  distri¬ 
bution  about  some  other  center  with  fewer  as¬ 
sociated  points  may  be  very  broad.  On  the  as¬ 
sumption  that  this  is  due  to  additional  structure 
in  the  flow  near  the  latter  center,  additional  res¬ 
olution  should  be  placed  in  this  region  instead  of 
the  denser  region  where  the  flow  is  already  well 
described. 

4. 1.  Predicting  predictability 

We  have  seen  that  the  predictability  of  a  re¬ 
construction  varies  with  location,  due  to  both 
the  underlying  dynamics  of  the  system  and  the 
weighting  scheme  used  in  the  construction  of  the 
predictor itself. We now quantify this variability
and, in so doing, estimate the uncertainty associated
with each individual prediction. This will
allow us to estimate the error associated with a
prediction at the time the prediction is made and
thus forecast error bars as well as expected values.

Fig. 6. Histogram of number of predictions made as classified by the nearest center to the point from which the prediction
was made. The data is from the test set of reconstruction A.

When  estimating  the  probable  errors  associ¬ 
ated  with  each  location  in  the  reconstruction  we 
again  use  the  partition  provided  by  the  centers. 
Let $i_n(x)$ equal the index of the center nearest
to the point $x$, that is

$i_n(x) = j$, (4.1)

where $\|x - c_j\|$ is the minimum value of
$\|x - c_k\|$ over all centers $c_k$.

At  the  beginning  of  the  test  set,  initial  esti¬ 
mates  can  be  drawn  from  the  histograms  of  the 
learning  set.  When  very  large  quantities  of  data 
are  available,  one  may  estimate  the  mean  and 
standard  deviation  error  associated  with  each 
center  (or  even  examine  each  distribution).  We 


note  that,  in  some  examples,  the  distributions 
are  far  from  Gaussian  (e.g.  bimodal)  and  the 
distribution  of  positive  errors  is  very  different 
from  that  of  negative  errors.  In  shorter  sets  where 
many  centers  may  have  only  a  few  (<  3)  tests, 
we  have  found  it  useful  to  define  the  average 
positive  predictor  error  associated  with  the  jth 
center  as 

$E_j^+ = \frac{\sum_k \delta_{j,\,i_n(x_k)}\, e_k}{\sum_k \delta_{j,\,i_n(x_k)}}$, (4.2)

where $\delta$ is the Kronecker delta, $e_k$ is the error
associated with the $k$th prediction (i.e. the predicted
value minus the observed), $x_k$ is the point from which
that prediction was made, and the sums are over all $k$
such that $e_k > 0$. $E_j^-$ is similarly defined for
$e_k < 0$. These average positive and negative errors are
then used as predicted error bars for future $x$ such
that $i_n(x) = j$. Positive
and  negative  errors  are  considered  separately,  so 

Fig.  7.  Histogram  of  local  error  from  the  test  set  of  reconstruction  A  classified  according  to  the  center  nearest  to  the  point 
predicted  as  in  fig.  6. 


asymmetries  in  the  predictability  are  preserved. 
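
The bookkeeping behind these asymmetric error bars can be sketched as follows. This is a minimal illustration, not the code used for the results reported here: the function names, the Euclidean norm and the convention of returning 0.0 for a center with no errors of a given sign are all assumptions made for the example.

```python
import math
from collections import defaultdict


def nearest_center(x, centers):
    """Return i_n(x): the index of the center closest to x (Euclidean norm assumed)."""
    return min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))


def asymmetric_error_bars(points, errors, centers):
    """Average positive and negative prediction error per center.

    points[k] is the base point of the k-th prediction and errors[k] is
    (predicted - observed). Returns {j: (E_plus, E_minus)}; a side with no
    samples is reported as 0.0 (an arbitrary choice for this sketch).
    """
    pos, neg = defaultdict(list), defaultdict(list)
    for x, e in zip(points, errors):
        j = nearest_center(x, centers)
        if e > 0:
            pos[j].append(e)
        elif e < 0:
            neg[j].append(e)
    bars = {}
    for j in range(len(centers)):
        e_plus = sum(pos[j]) / len(pos[j]) if pos[j] else 0.0
        e_minus = sum(neg[j]) / len(neg[j]) if neg[j] else 0.0
        bars[j] = (e_plus, e_minus)
    return bars
```

For a new base point x, the forecast error bar would then be read off as `bars[nearest_center(x, centers)]`, keeping the positive and negative sides separate.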

Two  short  sequences  taken  from  the  annulus 
temperature  series  b  are  shown  in  fig.  8  where 
the  time  scale  is  increased  from  that  of  fig.  2  to 
make  individual  predictions  more  clear.  In  this 
case, the predictions were made 18 ($\delta t_s$) steps
ahead  corresponding  to  almost  two  horizontal 
tick  marks.  Panel  A  shows  a  typical  result;  note 
that  the  expected  error  is  often  asymmetrically 
distributed  about  zero.  This  implies  that  the  ab¬ 
solute  value  of  the  average  prediction  error  could 
be  reduced  by  adding  a  constant  to  each  predic¬ 
tion,  the  value  of  which  was  dependent  upon  the 
nearest center, $i_n(x)$. This would improve the
predictions  but  at  the  cost  of  a  smooth  predictor 
(local  nonlinear  predictors  should  provide  even 
lower  predictor  error,  and  can  function  at  lower 
data  densities  than  local  linear  methods).  Mod¬ 
ifying  the  weighting  scheme  used  in  construct¬ 


ing  the  predictor  provides  an  alternative  global 
approach  which  preserves  smoothness. 

The more striking result is the reliability of
the estimated error: predictions which lie in
portions of the time series with sharp vertical
displacements have large estimated errors, while the
slowly changing portions expected to be more
predictable tend to have smaller estimated errors,
which are reflected in the observed error.

In  addition  to  their  practical  value,  these  esti¬ 
mates  can  be  used  to  identify  regions  of  the  re¬ 
construction  with  greater  instability  and  to  dis¬ 
tinguish  outliers  from  variation  due  to  this  in¬ 
stability.  The  only  instances  of  persistently  mis¬ 
leading  results  noted  thus  far  occur  when  the  tra¬ 
jectory  explores  a  portion  of  the  phase  space  not 
visited  in  the  learning  set;  this  condition  can  of¬ 
ten  be  identified  by  an  increase  in  the  nearest 
neighbor  distance  as  noted  above.  Persistently 
Fig. 8. A section from the learning set of reconstruction A where the temperature, its forecast value and the predicted
uncertainty are plotted. Here $\tau_p = 18$ ($\delta t_s$).


poor  predictions  and  error  estimates  may  also 
indicate  sensor  failure  or  a  change  in  the  dynam¬ 
ics  of  the  physical  system,  an  application  devel¬ 
oped  in  refs.  [55,56]. 

4.2.  Parametric  drift 

To  conclude  this  section  we  address  the  ques¬ 
tion  of  slow  parametric  drift  over  the  course  of 
the  experiment.  If  this  were  to  occur,  the  dy¬ 
namics  at  different  points  in  time  would  be  best 
determined  from  learning  sets  located  nearby  in 
time.  To  see  if  this  is  indeed  the  case,  the  learn¬ 
ing  set  above  (from  the  initial  4096  points)  was 


tested  on  the  remaining  time  series  divided  into 
thirds. Computing the Kolmogorov-Smirnov
statistic,  d,  between  the  out-of-sample  error  dis¬ 
tributions  from  a  given  predictor  on  different 
sections  of  the  series  [23],  we  find  that  the  null 
hypothesis  that  the  observed  distributions  arise 
from  the  same  distribution  function  cannot  be 
rejected  at  the  90%  level  of  confidence.  If  this 
test is applied directly to the two halves of series
b, we have $d_{\mathrm{obs}} \approx 0.04$ and probability
$(d > d_{\mathrm{obs}}) \approx 0.86$. When applied to the data series,
the Kolmogorov-Smirnov test indicates that
the  range  and  distribution  of  the  data  did  not 
change;  when  applied  to  the  error  distributions, 




it  indicates  that  the  quality  of  the  predictor  did 
not  change  and  hence  is  evidence  against  slow 
parametric  drift.  It  does  not,  of  course,  rule  out 
recurrent  parametric  drift  on  time  scales  short 
relative  to  the  length  of  the  series.  There  exists 
no  definitive  method  to  do  so  as,  in  practice,  the 
distinction  between  “parameters”  and  “system 
variables”  becomes  a  philosophical  question 
when  both  vary  on  small  time  scales. 
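
The two-sample comparison used above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the significance level is computed from the usual asymptotic series with a small-sample correction to the argument, which is assumed here rather than taken from the text.

```python
import math


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic d = max |F_a - F_b|."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Step both empirical distribution functions past the value v.
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_probability(d, n_a, n_b, terms=100):
    """Approximate probability of exceeding d if both samples share one distribution."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The alternating series converges slowly here; the limit is 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))
```

Applied to the out-of-sample errors from two segments, a large returned probability means the null hypothesis of a common distribution cannot be rejected. Note that the sample sizes passed in should reflect the number of effectively independent values, which for a densely sampled series may be far smaller than the number of points.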


5.  Correlation  dimension 

The Grassberger-Procaccia Algorithm or
GPA [14], which estimates the correlation dimension,
$d_2$, provides a direct measure of the
geometry of a distribution and has become perhaps
the most widely used tool in the search
for low dimensional dynamics. It is described
in detail in [57,58]. Briefly stated, one wishes
to estimate the correlation integral $C_2(\ell)$ of a
distribution of points $x$:

$C_2(\ell) = \frac{\text{number of pairs of points separated by less than } \ell}{\text{total number of pairs of points}} = \text{probability}(\|x_i - x_j\| < \ell)$, (5.1)

where $x_i$ and $x_j$ are two randomly chosen points
in the set. It is implicitly assumed here that one
is selecting from the set of all possible pairs of
points on the attractor. This is not the case with
reconstructions from time series when the spatial
separation between a pair of points reflects
that they are close in time. Theiler [59] demonstrated
that for smooth dynamical systems, consideration
of points close in time can lead to one-dimensional
“knees” in correlation integral estimates.
More recently, Osborne and Provenzale
[60] have found finite correlation dimensions
for power law noises, but these are another case
of this same effect and need not foil dimension
estimates in practice [61]. A simple test for detecting
such effects is given in ref. [9]. Taking
care that these effects are minimal, the correlation
integral is approximated as

$C_2(\ell) = \lim_{N \to \infty} \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \Theta(\ell - \|x_i - x_j\|)$, (5.2)

where $\ell$ is the length scale and $\Theta(x)$ is the Heaviside
function which is equal to zero for negative
argument and one otherwise. When the limit is
not taken, the sums over $i$ and $j$ should be restricted
so that $|i - j| > W$ [59], excluding pairs close in time. Numerically
efficient methods for evaluating the correlation
integral are available [62,63].

In the limit of small $\ell$, we expect $C_2(\ell)$ to be
scaling, that is

$C_2(\ell) \sim \chi(\ell)\, \ell^{d_2}$, (5.3)

which defines $d_2$, the correlation exponent;
$\chi(\ell)$ accounts for lacunarity effects [64,65].
At finite length scales, one can inspect the local
slope of $\log_2 C_2(\ell)$ as a function of $\log_2(\ell)$
for a scaling range over which to estimate $d_2$.
When estimating $d_2$, the $i = j$ terms in the
sum should be neglected, although it is useful
to compute $C_2(\ell)$ with both normalizations
(this involves negligible computational cost)
and compare their slopes as functions of $\log \ell$.
Both curves provide useful information in judging
the quality and evolution of reconstructions
with changes in the embedding dimension.
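
As a concrete illustration of the estimate described above, the following sketch evaluates a finite-sample correlation sum with a temporal exclusion window and the local slope of its logarithm. It is a brute-force O(N^2) version written for clarity, not one of the efficient methods cited; the function names are hypothetical, and the normalization used is the fraction of admissible pairs with the i = j terms always excluded.

```python
import math


def delay_embed(series, m, delay):
    """Delay reconstruction: points (s_i, s_{i+delay}, ..., s_{i+(m-1)delay})."""
    n = len(series) - (m - 1) * delay
    return [tuple(series[i + k * delay] for k in range(m)) for i in range(n)]


def correlation_sum(points, ell, window=0):
    """Fraction of pairs with |i - j| > window separated by less than ell."""
    close = total = 0
    n = len(points)
    for i in range(n):
        for j in range(i + window + 1, n):
            total += 1
            if math.dist(points[i], points[j]) < ell:
                close += 1
    return close / total if total else 0.0


def local_slopes(points, log2_ells, window=0):
    """Finite-difference slope of log2 C2 against log2 ell (None where C2 = 0)."""
    logs = []
    for le in log2_ells:
        c = correlation_sum(points, 2.0 ** le, window)
        logs.append(math.log2(c) if c > 0 else None)
    slopes = []
    for k in range(1, len(log2_ells)):
        if logs[k] is None or logs[k - 1] is None:
            slopes.append(None)
        else:
            slopes.append((logs[k] - logs[k - 1]) / (log2_ells[k] - log2_ells[k - 1]))
    return slopes
```

For points spread evenly along a line segment the local slope should be close to one at small length scales, which provides a quick sanity check before the routine is applied to a reconstruction.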

There  has  been  a  great  deal  of  discussion  in  the 
literature  regarding  the  amount  of  data  required 
to  obtain  a  meaningful  estimate  of  the  character¬ 
istics  of  chaotic  dynamical  systems.  For  the  cor¬ 
relation  exponent,  several  authors  [66-69]  have 
provided  estimates  of  the  minimum  “number  of 
data  points”  required.  Unfortunately,  it  is  not 
easy  to  determine  the  number  of  data  points  in 
a  time  series  in  this  sense.  The  difficulty  lies  in 
assumptions  which  require  the  data  to  be  spread 
uniformly  with  respect  to  some  underlying  prob¬ 
ability  density  (measure).  Appeals  to  ergodicity 
are  of  no  use  when  the  sampling  rate  is  such  that 
consecutive  measurements  are  dynamically  cor¬ 
related,  for  this  biases  the  correlation  integral  by 
increasing the probability of a pair of points at
separation $\ell + \delta$, given that there is a pair with
separation $\ell$. The dynamical correlation time is
also very difficult to estimate a priori; it is certainly
not the linear correlation time (i.e. the
first zero of the linear autocorrelation function,
$\tau_{\mathrm{auto}}$) or the first minimum of the one dimensional
mutual information [68]. A second drawback
of these scaling arguments is that they provide
necessary but not sufficient conditions, and
the  former  are  of  much  less  use  than  a  measure 
of  success.  It  is  easily  shown  that  through  smooth 
deformation  of  a  reconstruction,  one  can  always 
increase  the  number  of  data  points  required  to 
obtain  a  good  estimate  of  a  dimension. 

Like  all  analysis  techniques,  the  GPA  must 
be  applied  with  some  insight.  There  has  been 
much  discussion  about  the  a  priori  knowledge  of 
the  system  required  to  apply  this  algorithm.  We 
would  liken  application  of  the  GPA  to  that  of  the 
Fast  Fourier  Transform  (FFT):  one  must  under¬ 
stand  the  algorithm  and  its  limitations  when  in¬ 
terpreting  the  results.  To  push  the  analogy,  one 
rarely hears reports of strong power in a frequency
beyond the Nyquist limit, or public arguments
over whether it is really necessary to have
a  stationary  time  series  to  apply  the  FFT.  Such 
results  would  reveal  a  false  application  of  the  al¬ 
gorithm,  not  a  flaw  in  it.  Nor  is  it  claimed  that 
one  must  understand  the  physics  of  the  system  to 
gain  useful  information  from  a  power  spectrum. 
The  analogy  holds. 

The analogy fails when the difficulty of
coding the algorithm is considered, and the
general knowledge of its limitations. Particular
errors in application are well documented
[59,66,58,57,70] although they are still common.
Even so, these are again necessary, not
sufficient conditions. Even when precautions
are taken, one would like to estimate the probability
that a given result, with specified reconstruction
parameters (delay times or SVD window
lengths, etc.) is not due to such factors as
the length of the data set. This is the strength of
employing surrogate signals.


5.1. Surrogate signals and the correlation integral

The  FT  surrogate  generator  is  now  used  to 
evaluate  the  results  obtained  for  the  rotating  an¬ 
nulus  data.  Rather  than  attempt  to  automate  the 
choice  of  a  scaling  region,  fig.  9  presents  the 
slope  of  the  correlation  integral  from  series  b 
data  along  with  that  of  a  representative  surrogate 
set. The solid (short-dashed) lines represent the
slope of $\log_2 C_2(\ell)$ against $\log_2 \ell$ including
(excluding) the $i = j$ points in the sum. The
regular  long-dashed  line  is  the  expected  slope  for 
white  noise  with  the  same  diameter  (see  [66] ). 
The  difference  between  observed  and  surrogate 
graphs  is  striking  both  in  the  value  of  the  plateau 
(if  one  can  be  said  to  exist  for  the  surrogate  set) 
and  the  relative  location  of  the  nearest  neighbor 
distances as reflected by the value of $\log_2(\ell)$ at
which  the  curve  including  the  i  =  j  points  re¬ 
turns  to  zero.  It  is  tempting  to  define  a  scaling 
range  and  determine  the  probability  of  observ¬ 
ing  the  value  for  the  surrogate  sets.  Such  a 
calculation  is  questionable  as  the  algorithm  has 
not  converged  in  the  case  of  this  surrogate  se¬ 
ries  and  there  is  no  saturation  as  the  embedding 
dimension  is  increased.  This  detracts  little  from 
the  argument  that  the  observed  series  has  a  sig¬ 
nificantly  different  correlation  integral  than  ex¬ 
pected  due  to  its  autocorrelation  function. 
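
For readers who wish to experiment, a surrogate that shares the autocorrelation function of an observed series can be sketched as below. We assume here that the FT surrogate is the standard phase-randomized construction (transform, randomize the Fourier phases, invert); the function names are hypothetical, and the direct O(N^2) transform is used only to keep the example self-contained and is suitable for short series.

```python
import cmath
import math
import random


def dft(x):
    """Direct discrete Fourier transform (O(N^2), for short series only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft_real(spec):
    """Inverse transform, returning the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def ft_surrogate(x, rng):
    """Phase-randomized surrogate with the same power spectrum as x."""
    n = len(x)
    spec = dft(x)
    # Rotate each positive-frequency coefficient by a random phase and keep
    # the conjugate symmetry so that the inverse transform stays real.
    # The mean (k = 0) and, for even n, the Nyquist term are left untouched.
    for k in range(1, (n - 1) // 2 + 1):
        spec[k] *= cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        spec[n - k] = spec[k].conjugate()
    return inverse_dft_real(spec)
```

Because only the phases change, the surrogate has the same power spectrum, and hence the same circular autocorrelation function, as the original series, while any structure carried by the phases is destroyed.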

We  now  examine  the  series  d  data  set,  consid¬ 
ered  to  be  quasi-periodic  (two  incommensurate 
frequencies)  by  Read  et  al.  [13].  The  slope  of 
the  correlation  integral  with  length  scale  for  this 
set  is  shown  in  fig.  10.  Note  that  it  does  appear  to 
be about two-dimensional. The feature at length
scales $-2.5 < \log_2(\ell) < -1$ reflects the macroscopic
structure of the reconstruction and is not an
artifact of the analysis. Assuming that the reconstruction
is a two-torus, this would indicate that
it has greater extent in “one direction”; the true
dimension of the distribution is not observed until
length scales smaller than 5 are reached. Even
if a surrogate generator preserves the geometry of
the two-torus, this particular macroscopic structure
need not be maintained; macroscopic distortions
will shift the scaling range, confounding
attempts to compare correlation dimension estimates
between observed and surrogate signals
even in a case where they both converge in the
limit of small $\ell$. The point here is that two topologically
equivalent distributions with different
macroscopic structure will have different correlation
integrals at large scales. This is a fundamental
limitation inherent in the geometric analysis
of reconstructions, and provides an example
where the lower bounds on data requirements for
dimension calculations are vast underestimates
of the true amount of data required for this type
of analysis. (In this particular case, the Fourier
spectrum indicated a quasi-periodic attractor.)
It also provides an example of where surrogate
series can provide misleading results if a fixed
scaling range is used.

Fig. 9. Local slope of the correlation integral from series b. This analysis utilized 2'^ data points: (a) observed and (b) surrogate.

Fig. 10. Local slope of the correlation integral from series d.


6.  Nonlinear  prediction  of  stochastic  systems 

The  arguments  above  demonstrate  that  non¬ 
linear  predictors  can  distinguish  dynamical  sys¬ 
tems  with  a  structured  phase  space  flow  from 
those  whose  motion  in  phase  space  is  incoherent. 
If  we  identify  the  former  systems  as  determinis¬ 
tic  and  the  latter  as  stochastic,  we  have  a  good 
test for determinism. Unfortunately such a classification
will consider many classic “stochastic”
systems as deterministic as it fails to distinguish
“determinism” from stochastic systems
which  are  “low  dimensional”  in  the  sense  that 
they  are  associated  with  a  probabilistic  flow  in  a 
low  dimensional  phase  space. 

In  this  section  we  shall  consider  two  systems, 
the disturbed pendulum of Yule [71] and the
Ornstein-Uhlenbeck process [72,73]. The disturbed
pendulum provides an example of a deterministic
system where noise feeds back into
the dynamics (dynamic noise) rather than being
superimposed on the measurements (observational
noise). The Ornstein-Uhlenbeck process
has become a paradigm of stationary stochastic
systems. We demonstrate that nonlinear deterministic
predictors provide a good approximation
to optimal prediction of this system and
indicate the difficulties this implies for tests of
determinism  using  surrogate  signals.  We  are  not 
interested  here  in  establishing  whether  one  type 
of  nonlinear  deterministic  predictor  is  better 
than  another,  but  in  their  common  properties. 

Yule  considered  two  simple  models  to  ac¬ 
count  for  the  lack  of  simple  periodicity  in  the 
15 sunspot cycles then available. Both models
are  based  on  observations  of  a  pendulum.  In 
the  first,  the  observations  of  perfect  periodicity 
are  subject  to  superposed  fluctuations  or  obser¬ 
vational  noise.  In  this  case,  for  sufficiently  long 
series,  Fourier  analysis  will  detect  the  under¬ 
lying  periodicity.  Any  deterministic  predictors 
which  allow  for  observational  noise  should  do 
so  as  well.  In  the  second  case,  the  observational 
noise  is  considered  negligible,  but  disturbances 
to  the  pendulum’s  motion  (caused  by  boys  with 
pea shooters) change the energy of the pendulum
and feed back into the system's dynamics.
When  the  shocks  are  well  separated  in  time, 
nonlinear  deterministic  predictors  will  give  ex¬ 
cellent  predictions  (between  the  shocks)  due  to 
the  structure  of  the  underlying  two-dimensional 
phase  space  of  the  pendulum.  Good  predictions 
are possible as long as the expected time interval
between impacts, $\Delta t$, is not small relative
to the sum of the reconstruction window and
prediction time or

$\Delta t > (m - 1)\tau_d + \tau_p$. (6.1)

The Ornstein-Uhlenbeck process models the
velocity of a Brownian particle. From a dynamical
systems perspective, it is preferable to consider
the velocity, $u(t)$, rather than the displacement,
as the velocity time series is stationary.
The change in the velocity, $du(t)$, is given by

$du(t) = -\beta u(t)\,dt + \sigma \gamma(t) \sqrt{dt}$, (6.2)

where $\gamma(t)$ is a random Gaussian process with
zero mean and unit variance, $dt$ is the time step,
and the parameters $\beta$ and $\sigma$ are related to frictional
drag and the driving impacts respectively.
The optimal (statistical) predictor for this process
is known; given the initial condition $u_0$, the
expected value of $u(t)$ is

$F_{\mathrm{theory}}(u(0)) = E(u(t) \mid u(0) = u_0) = u_0 e^{-\beta t}$. (6.3)

Estimates for the variance are also available
[73]. To test whether the reconstructed dynamics
finds this structure, a 2048 point learning
set ($\beta = 0.5$, $\sigma = 1.0$, $dt = 0.05$) with
$m = 1$, $n_c = 64$ and $\phi(r) = r$ was constructed
and tested out-of-sample on an additional 2048
points. The point here is not whether this radial
basis function predictor is optimal, but is merely
to demonstrate that any good dynamic reconstruction
should identify this structure in an
Ornstein-Uhlenbeck series (or a series from a
stochastic model like that of Barnes et al. [27]).
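
The expected-value structure that a good reconstruction should recover can be checked numerically with a short simulation. The sketch below iterates the velocity update with a simple explicit step and compares an ensemble average with the exponential decay of the optimal predictor; the function names are hypothetical, and the symbols beta and sigma for the drag and forcing parameters are assumed.

```python
import math
import random


def ou_path(u0, beta, sigma, dt, steps, rng):
    """Iterate du = -beta*u*dt + sigma*gamma*sqrt(dt), with gamma ~ N(0, 1)."""
    u = u0
    for _ in range(steps):
        u += -beta * u * dt + sigma * rng.gauss(0.0, 1.0) * math.sqrt(dt)
    return u


def f_theory(u0, beta, t):
    """Expected value of u(t) given u(0) = u0."""
    return u0 * math.exp(-beta * t)


def ensemble_mean(u0, beta, sigma, dt, steps, paths, rng):
    """Average of u(steps*dt) over independent realizations started at u0."""
    return sum(ou_path(u0, beta, sigma, dt, steps, rng) for _ in range(paths)) / paths
```

With beta = 0.5, sigma = 1.0 and dt = 0.05 as in the text, the ensemble mean after 20 steps started from u0 = 2 should lie near `f_theory(2.0, 0.5, 1.0)`, while individual realizations scatter widely about it. It is this mean that a deterministic predictor can learn.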

We compare the predictor with $F_{\mathrm{theory}}$ by plotting
the predicted future value against the current
observed value in fig. 11. The solid line corresponds
to $F_{\mathrm{theory}}$. A scatter plot of the observed
future value against the current observed value
shows a wide distribution. Both the agreement
and disagreement between the deterministic predictor
and the expected value displayed in fig.
11 are understood. The largest values of $u$ in the

Fig. 11. Prediction of an Ornstein-Uhlenbeck process. The solid line denotes the optimal prediction. The predictor appears
to be double valued because predictions for both positive and negative x are superimposed. The very inaccurate predictions
at large x occur when the system explores a region of phase space not visited in the learning set. At small x, the variance
in the expected value is large and the ability of the predictor to find the expected value is diminished.


test  set  correspond  to  very  poor  predictions  (the 
markers  on  the  right  side  of  the  plot  well  be¬ 
low  the  ideal  line);  these  points  have  large  near¬ 
est  center  distances  and  correspond  to  values 
of $u$ not visited in the learning set. For slightly
smaller $u_0$ there is good agreement between the
two predictors; the two images of the deterministic
predictor are the superimposed values for
positive and negative $u_0$. For small $u_0$ there is
poor agreement between the two predictors, due
to both the magnification of small distortions by
the logarithmic scales and the increase in uncertainty
of future values of initial conditions near
$u = 0$.

To  the  extent  that  these  systems  are  deter¬ 
ministic,  the  dynamic  reconstructions  quantify 
their  behavior.  Yet  they  are  stochastic  in  the 
sense  that  the  current  state  of  the  system  does 
not  completely  define  its  future.  Once  the  de¬ 
terministic  structure  is  quantified,  the  quality 
of  the  predictions  should  not  improve  regard¬ 


less  of  increases  in  the  amount  of  data  available. 
The  lack  of  improved  precision  with  increasing 
data,  embedding  dimension  or  changing  delay 
time  (for  infinite  data  sets)  provides  an  indica¬ 
tion  of  the  stochastic  component  of  the  process 
but  is  difficult  to  establish  with  finite  data  sets. 
In  these  particular  cases,  examining  the  predic¬ 
tor  error  series  of  the  pendulum,  and  its  spatial 
variation  in  the  O-U  process,  could  help  iden¬ 
tify  the  dynamics  of  the  processes.  The  situa¬ 
tion  is  more  complicated  in  stochastic  systems 
with  more  complex  (higher-dimensional)  phase 
space  structure.  Here  nonlinear  dynamics  comes 
from  nonlinear  structure  in  the  governing  equa¬ 
tions  regardless  of  whether  they  are  stochasti¬ 
cally  or  deterministically  driven.  As  the  structure 
of the governing equations increases, the nature
of the stochastic forcing may become less apparent.
This holds implications for the use of surrogate
data, in that surrogate generators which destroy
this structure will be distinguished regardless
of whether the underlying driving force is
stochastic or deterministic.

One  approach  to  recover  this  distinction  is 
to  look  to  longer  time  scales.  Stone  [54]  has 
considered  a  Duffing  oscillator  driven  either  si¬ 
nusoidally  (chaos)  or  by  random  perturbations 
(stochasticity) and shown that the signals from
these  two  systems  are  similar  in  terms  of  power 
spectra  and  symbolic  dynamics.  The  short  time 
predictability  of  these  signals  is  similar  as  well; 
however  if  one  considers  long  time  phenomena 
the  two  cases  can  be  distinguished.  In  particular, 
the  series  of  time  intervals  between  departures 
from  the  origin  is  distinguished  by  the  predic¬ 
tor  presented  here.  These  return  times  are  effec¬ 
tively  independent  and  identically  distributed  in 
the  stochastically  forced  system  while  the  chaotic 
system  is,  initially,  predictable  and  the  distribu¬ 
tion  of  predictor  error  for  return  times  appears 
to  relax  to  the  same  distribution  as  the  stochastic 
case  as  predictions  are  made  farther  into  the  fu¬ 
ture.  As  the  nonlinear  structure  of  these  two  sys¬ 
tems  is  identical,  they  provide  a  useful  example 
of  the  similarities  and  differences  of  stochastic 
and  deterministic  behavior. 


7.  Discussion 

It  is  customary  to  consider  low  dimensional 
determinism and stochasticity as two clearly distinct
types of behavior. As we have seen, distinguishing
between these alternatives is sometimes
difficult. We have presented a general approach
to evaluating algorithms, which attempts
this  distinction  through  contrasting  the  results  a 
given  algorithm  produces  on  the  observed  data 
with  those  produced  from  surrogate  data.  The 
importance  of  choosing  a  good  surrogate  gener¬ 
ator  has  been  stressed  and  the  general  effective¬ 
ness  of  this  approach  has  been  demonstrated  for 
correlation  exponent  calculations  and  prediction 
algorithms  on  laboratory  data. 

We  have  focused  our  attention  primarily  on 
the  rotating  annulus  experiments  and  estab¬ 


lished  that  these  data  sets  differ  significantly 
from  the  surrogate  series  considered.  This  gives 
us  confidence  that  dynamical  systems  tech¬ 
niques  can  provide  a  better  understanding  of 
this  system,  in  particular  in  determining  the  na¬ 
ture  of  the  underlying  driving  mechanisms.  This 
goal  is  difficult  to  obtain  with  the  data  in  the 
form  presented  here  for,  while  it  may  display 
deterministic,  “low  dimensional”  behavior,  the 
physics  in  delay  space  is  not  at  all  simple.  Dy¬ 
namical  systems  texts  often  give  the  impression 
that a system which evolves on a low dimensional
(say $d_2 < 5$) attractor has simple physics.
This is somewhat misleading. For a set of five
ordinary differential equations (ODE’s) it is
true, perhaps even for a set of 10 ODE’s which
collapse onto such an attractor.
For a large physical system with many degrees of
freedom, the dynamics in 5D is certainly simpler
than it would be without such a restriction, but the
physics is a mess in 5D. The equations of motion
need  not  correspond  to  the  macroscopic  physi¬ 
cal  properties  of  the  system  and  will  almost  cer¬ 
tainly  not  correspond  to  a  set  of  simple  ODE’s. 
While  a  great  deal  can  be  learnt  from  such  sys¬ 
tems,  it  is  misleading  to  imply  that  the  physics, 
in  a  traditional  sense,  will  become  clear.  Indeed, 
we  may  need  to  develop  a  new  way  of  interpret¬ 
ing  physics  and  it  is  tempting  to  draw  an  analogy 
with  the  way  statistical  mechanics  answers  dif¬ 
ferent  questions  than  classical  dynamics.  An  al¬ 
ternate  approach  which  we  are  currently  pursu¬ 
ing  with  the  annulus  data  is  to  recast  the  data  into 
a  form  in  which  the  physics  is  more  assessable. 
The  spatial  distribution  of  probes  allows  a  spa¬ 
tial  Fourier  transform  into  wavenumber  space. 
With  the  data  in  this  form,  a  multivariable  re¬ 
construction  can  address  the  general  question  of 
predictability  directly,  as  well  as  particular  ques¬ 
tions  concerning  which  mode  interactions  drive 
the dynamics of the system. For example, suppose
the data is recast into a multivariate series
of the amplitudes of modes A, B, C, .... Using
the predictor discussed above, we plan to examine
the extent to which the energy in modes A
and B determines the future behavior of mode
C,  thereby  directly  testing  cascade  and  mode¬ 
mode  interaction  hypotheses.  Through  this  type 
of  study  we  hope  to  clarify  what  physical  pro¬ 
cesses  dominate  the  dynamics  of  this  system. 

We  have  also  shown  how  dynamic  reconstruc¬ 
tions  can  be  used  to  address  open  questions  con¬ 
cerning  the  experiment  itself.  By  demonstrating 
that  predictors  formed  from  one  segment  of  the 
time  series  yield  statistically  indistinguishable 
errors  when  applied  to  different  segments  of  the 
time  series,  we  have  shown  that  the  complex¬ 
ity  observed  is  not  due  to  slow  parametric  drift 
over  the  duration  of  the  experiment.  The  statis¬ 
tically  significant  difference,  between  both  the 
predictability  and  the  correlation  integrals  of  the 
observed  signals  and  surrogates  generated  with 
identical  autocorrelation  functions,  provides  a 
strong  case  for  low  dimensional  dynamics  in  this 
system.  This  case  is  further  supported  by  the 
demonstration  that  a  simple  prediction  by  mem¬ 
ory  scheme,  while  capable  of  distinguishing  be¬ 
tween  the  FT  surrogates  and  the  observations,  is 
much  less  accurate  than  the  deterministic  radial 
basis  function  predictor. 

In  this  paper,  we  have  applied  a  global,  non¬ 
linear  predictor  based  on  radial  basis  function 
interpolation  which  explicitly  considers  noise  in 
the  data  set  and  the  inhomogeneity  of  the  re¬ 
construction  in  phase  space.  This  type  of  predic¬ 
tor  may  be  improved  in  several  ways.  For  exam¬ 
ple,  the  reconstruction  may  be  altered  to  include 
known physics in the problem at hand: in a problem
where diurnal cycles are known to be important,
the time of day could be included by placing
the reconstruction on a circle. In systems like
the  annulus,  the  predictability  may  be  improved 
by  recasting  the  data  set  into  a  form  in  which  the 
physics  is  more  assessable  as  discussed  above.  It 
is  often  the  case  that  additional  information  re¬ 
garding  the  macroscopic  state  of  the  system  is 
available  in  addition  to  time  series  data.  Exam¬ 
ples  under  investigation  include  laboratory  data 
where  the  phase  of  a  forcing  function  is  known 
[74],  and  meteorological  series,  where  the  gen¬ 


eral  structure  of  the  regional  weather  pattern  is 
included  to  improve  the  prediction  of  local  tem¬ 
perature  series  [75].  For  finite,  noisy  data  sets, 
considerations  such  as  these  may  be  crucial  to 
obtaining  a  significant  result. 

In  addition  to  better  embeddings,  improve¬ 
ments  in  the  prediction  scheme  are  also  possi¬ 
ble  but  are  likely  to  involve  system  specific  an¬ 
swers.  For  example,  the  choice  between  iterative 
forecasting  and  direct  forecasting  may  vary  with 
the  particular  dynamical  system,  the  data  den¬ 
sity,  the  noise  level  and  even  the  details  of  the 
predictor  itself  (local  or  global,  linear  or  non¬ 
linear,  ...).  The  system  specific  nature  of  this 
problem  is  likely  to  reoccur  in  other  details  of 
reconstructions,  such  as  the  importance  of  the 
method  employed  for  choosing  centers.  For  the 
predictor  presented  here,  one  may  improve  the 
method  used  to  account  for  noise;  we  have  ap¬ 
plied  a  straightforward  least  squares  approach. 
Implicit  in  this  approach  is  the  assumption  that 
the “noise” is located in the quantity being predicted
(5), not the base point (x). For delay reconstructions
this is certainly not the case: the
same noise level is present in the base point as
in the prediction. One approach to this problem
would  be  to  consider  total  least  squares.  This  is 
analogous  to  performing  an  SVD  fit  in  two  di¬ 
mensions  rather  than  a  least  squares  fit  when  it 
is  known  that  there  is  error  in  both  coordinates. 
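The two-dimensional analogy can be made concrete with a small sketch. It is an illustration only, not the predictor used in this paper: the function names and the four sample points are hypothetical, and the total least squares slope is taken as the major principal axis of the scatter, which assumes equal noise levels in both coordinates.

```python
import math

def _centered_moments(xs, ys):
    """Return (sxx, syy, sxy): sums of squares and cross-products about the means."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy

def ols_slope(xs, ys):
    """Ordinary least squares: all of the error is assumed to lie in y."""
    sxx, _, sxy = _centered_moments(xs, ys)
    return sxy / sxx

def tls_slope(xs, ys):
    """Total least squares: slope of the major principal axis of the scatter,
    which minimizes perpendicular distances (error in both coordinates).
    A vertical axis gives a very large slope rather than an exception."""
    sxx, syy, sxy = _centered_moments(xs, ys)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.tan(theta)

# Hypothetical sample points, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 1.0, 3.0]
slope_ols = ols_slope(xs, ys)
slope_tls = tls_slope(xs, ys)
```

For scattered data the ordinary fit is biased toward a shallower slope than the total least squares fit; for points lying exactly on a line the two agree.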

In  the  attempt  to  distinguish  between  deter¬ 
ministic  and  stochastic  dynamics  through  pre¬ 
diction,  one  complication  has  been  noted:  the 
ability  of  deterministic  predictors  to  identify 
the  expected  values  for  some  stochastic  systems, 
and  thereby  differentiate  them  from  (some) 
surrogates.  This  is  particularly  true  in  effectively 
low  dimensional  stochastic  systems,  systems 
which exhibit stochastic motion within a structured
low dimensional phase space. (Although
as stochastic systems they remain, of course, infinite
dimensional.) While such systems clearly
fail to follow strict Laplacian determinism, it
is not clear how they are best classified. Their
detection by nonlinear prediction will depend
on the particulars of what quantities and length
scales  are  analyzed,  and  useful  classification  in 
the  presence  of  these  effects  may  require  consid¬ 
eration  of  second  order  properties  of  the  predic¬ 
tor.  In  the  event,  these  distinctions  may  require 
a  more  precise  definition  of  what  constitutes 
determinism. 


Acknowledgements 

I  thank  M.  Muldoon  for  many  stimulating  dis¬ 
cussions  and  mutual  information  calculations. 
I  am  happy  to  acknowledge  helpful  discussions 
and disagreements with J. Theiler, D. Broomhead,
K. Fraedrich, G. King and B. Mestel,
and would like to thank P. Read, M. Gaster and T.
Mullin for supplying data, the analysis of which
provided the insights reported here. M. Chappell,
C. Lanone, P. Read and J. Theiler have
provided  very  useful  criticisms  of  a  draft  of  this 
paper.  Finally,  I  thank  N.  Weiss  for  discussions 
about  chaos  in  the  sun  and  the  sunspot  record. 
This  research  has  been  supported  by  the  Science 
and  Engineering  Research  Council  and  the  US 
Office  of  Naval  Research. 


References 

[1] E.N. Lorenz, J. Atmos. Sci. 20 (1963) 130.

[2] D.W. Moore and E.A. Spiegel, Astrophys. J. 143 (1966)
871-887.

[3] J. Crutchfield, J.D. Farmer, N.H. Packard and R.S.
Shaw, Sci. Am. 254 (1986) 46-57.

[4] N. Gershenfeld, in: Directions in Chaos, ed. B.-L. Hao
(World Scientific, 1988) pp. 310-383.

[5] S. Eubank and D. Farmer, An introduction to chaos
and randomness, in: Proc. SFI Summer School, ed. E.
Jen (Addison-Wesley, 1990).

[6] Pierre-Simon, Marquis de Laplace, Théorie
Analytique des Probabilités (Paris, 1820), reproduced
in the Oeuvres complètes de Laplace, Volume 11 (Paris,
1886).

[7] J. Theiler, B. Galdrikian, A. Longtin, S. Eubank and J.D.
Farmer, in: Nonlinear Prediction and Modelling, eds.
M. Casdagli and S. Eubank (Addison-Wesley, 1992)
pp. 163-188.

[8] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian and
J.D. Farmer, Physica D 58 (1992) 77, these Proceedings.


[9]  A.  Provenzale,  L.A.  Smith,  R.  Vio  and  G.  Murante, 
Physica  D  58  (1992)  31,  these  Proceedings. 

[10]  E.  Kostelich  and  J.  Yorke,  Physica  D  41  (1990)  183- 
196. 

[11]  L.A.  Smith,  Quantifying  chaos  with  predictive  flows 
and  maps:  locating  unstable  periodic  orbits,  in: 
Measures  of  Complexity  and  Chaos,  eds.  N.B.  Abraham 
et al. (Plenum, 1990).

[12]  L.A.  Smith,  Applied  chaos:  computing  unstable 
periodic  orbits  through  predictive  flows  and  maps,  in: 
Information  Dynamics,  eds.  H.  Atmanspacher  et  al. 
(Plenum,  1991). 

[13] P. Read, M.J. Bell, D.W. Johnson and R.M. Small,
Chaotic regimes in rotating baroclinic flow, J. Fluid
Mech. (1991), to appear.

[14] P. Grassberger and I. Procaccia, Phys. Rev. Lett. 50
(1983) 346.

[15] P. Read, Applications of singular value decomposition
to the analysis of “baroclinic chaos”, Physica D 58
(1992) 455, these Proceedings.

[16] R.L. Smith, in: Nonlinear Prediction and Modelling,
eds. M. Casdagli and S. Eubank (Addison-Wesley, 1992)
pp. 115-136.

[17] C. Nicolis and G. Nicolis, Nature 311 (1984) 529.

[18] P. Grassberger, Nature 323 (1986) 609.

[19] C. Nicolis and G. Nicolis, Nature 326 (1987) 523.

[20] P. Grassberger, Nature 326 (1987) 524.

[21] I. Procaccia, Nature 333 (1988) 498-499.

[22] R. von Mises, Probability, Statistics and Truth (George
Allen and Unwin, London, 1957).

[23]  W.H.  Press,  B.P.  Flannery,  S.A.  Teukolsky  and  W.T. 
Vetterling,  Numerical  Recipes,  (Cambridge  Univ. 
Press,  Cambridge,  1987). 

[24]  A.R.  Osborne,  A.D.  Kirwan,  A.  Provenzale  and  L. 
Bergamasco,  Physica  D  23  (1986)  75-83. 

[25]  J.A.  Eddy,  Science  192  (1976)  1182-1202. 

[26]  E.A.  Spiegel  and  A.  Wolf,  Chaos  and  the  solar  cycle, 
in:  Chaos  in  Astrophysics  Ann.  NY  Acad.  Sci.  Vol.  497 
(1987)  pp.  55-60. 

[27]  J.A.  Barnes,  H.H.  Sargent  and  P.V.  Tryon,  Sunspot 
cycle  simulation  using  random  noise,  in:  The  ancient 
sun  eds.  R.O.  Pepin,  J.A.  Eddy  and  R.B.  Merrill 
(Pergamon,  New  York,  1980). 

[28]  N.O.  Weiss,  Phil.  Trans.  R.  Soc.  London  A  330  (1990) 
617-625. 

[29]  D.S.  Broomhead  and  R.  Jones,  Time-series  analysis, 
Proc.  R.  Soc.  London  423  (1989)  103-121. 

[30] N.H. Packard, J.P. Crutchfield, J.D. Farmer and R.S.
Shaw, Phys. Rev. Lett. 45 (1980) 712.

[31] F. Takens, in: Dynamical Systems and Turbulence, eds.
D. Rand and L.-S. Young (Springer, 1981) p. 366.

[32]  Th.  Buzug,  T.  Reimers  and  G.  Pfister,  Europhys.  Lett. 
13  (1990)  605-610. 

[33]  M.  Casdagli,  Chaos  and  deterministic  versus  stochastic 
non-linear  modeling,  J.  R.  Statist.  Soc.  B  submitted 
(1991). 

[34]  T.  Sauer,  J.A.  Yorke  and  M.  Casdagli,  J.  Stat.  Phys.  65 
(1991)  579-616. 



[35]  A.M.  Fraser,  Physica  D  34  (1989)  391-404. 

[36] J.L. Breeden and N.H. Packard, Nonlinear analysis of
data sampled nonuniformly in time, Technical Report
CCSR-91-15, Center for Complex Systems Research,
Urbana, IL 61801 (1991).

[37]  D.S.  Broomhead  and  G.P.  King,  Physica  D  20  (1986) 
217. 

[38]  G.  King,  R.  Jones  and  D.S.  Broomhead,  Nucl.  Phys.  B 
(Proc.  Suppl.)  2  (1987)  379. 

[39]  J.D.  Farmer  and  J.  Sidorowich,  Phys.  Rev.  Lett.  59 
(1987)  8. 

[40]  J.  Crutchfield  and  B.S.  McNamara,  J.  Compl.  Syst.  1 
(1987)  417-452. 

[41] D.S. Broomhead and D. Lowe, J. Compl. Syst. 2 (1988)
321-355.

[42] A.I. Mees, Modelling complex systems, in: Proc. Conf.
on Modelling Complex Systems, eds. L.S. Jennings, A.I.
Mees and T.L. Vincent (Birkhäuser, 1989).

[43]  M.  Casdagli,  Physica  D  35  (1989)  335-356. 

[44]  N.H.  Packard,  J.  Compl.  Syst.  4  (1990)  543. 

[45]  A.S.  Weigend,  B.A.  Huberman  and  D.  E.  Rumelhart, 
Predicting  the  future:  A  connectionist  approach.  Intern. 
J.  Neural  Syst.  (1990),  submitted. 

[46]  G.  Sugihara  and  R.M.  May,  Nature  344  (1990)  734. 

[47]  H.  Tong,  Non-Linear  Time  Series  Analysis,  (Oxford 
Univ.  Press,  1990). 

[48] K. Stokbro, Predicting chaos with weighted maps,
Nordita preprint (1991).

[49] J.D. Farmer and J. Sidorowich, in: Evolution, Learning,
and Cognition, ed. Y.C. Lee (World Scientific, 1988)
p. 277.

[50] M.J.D. Powell, Radial basis functions for multivariate
interpolation: a review, in: Proc. IMA Conf. on
Algorithms for the Approximation of Functions and
Data (RMCS Shrivenham, 1985).

[51] C.A. Micchelli, Constr. Approx. 2 (1986) 11-22.

[52] A. Arneodo, G. Grasseau and E.J. Kostelich, Phys. Lett.
A 131 (1987) 426.

[53]  J.  Guckenheimer  and  P.  Holmes,  Nonlinear 
Oscillations,  Dynamical  Systems  and  Bifurcations  of 
Vector  Fields,  Vol.  42  of  Applied  mathematical  sciences 
(Springer,  1983). 


[54]  E.  Stone,  Phys.  Lett.  A  148  (1990)  434-442. 

[55]  L.A.  Smith,  K.  Godfrey,  P.  Fox  and  K.  Warwick,  A  new 
technique  for  fault  detection  in  multi-sensor  probes,  in: 
Control  91  (1991)  p.  1062. 

[56]  L.A.  Smith  and  K.  Godfrey,  Nonlinear  methods  of 
fault  detection  in  multi-sensor  probes,  in  preparation 
(1992). 

[57]  P.  Grassberger,  T.  Schreiber  and  C.  Schaffrath,  Non¬ 
linear  time  sequence  analysis.  Preprint  WUB  91-14 
(1991). 

[58]  J.  Theiler,  J.  Opt.  Soc.  Am.  A7  (1990)  1055-1073. 

[59]  J.  Theiler,  Phys.  Rev.  A  34  (1986)  2427-2432. 

[60] A.R. Osborne and A. Provenzale, Physica D 35 (1989)
357-381.

[61]  J.  Theiler,  Phys.  Lett.  A  155  (1991)  480-493. 

[62]  J.  Theiler,  Phys.  Rev.  A  36  (1987)  4456-4462. 

[63]  P.  Grassberger,  Phys.  Lett.  A  148  (1990)  63. 

[64]  R.  Badii  and  A.  Politi,  Phys.  Lett.  A  104  (1984)  303- 
305. 

[65]  L.A.  Smith,  J.-D.  Fournier  and  E.A.  Spiegel,  Phys.  Lett. 
A  114  (1986)  465. 

[66] L.A. Smith, Phys. Lett. A 133 (1988) 283.

[67] D. Ruelle, Proc. R. Soc. London A 427 (1990) 241-248.

[68] J. Theiler, Phys. Rev. A 41 (1990) 3038-3051.

[69] C. Essex and M. Nerenberg, Proc. R. Soc. London
A 435 (1991) 287-292.

[70] K. Judd and A.I. Mees, Estimating dimensions with
confidence, Aust. J. Bif. Chaos (June 1991).

[71] G.U. Yule, Phil. Trans. R. Soc. London A 226 (1927)
267-298.

[72] G.E. Uhlenbeck and L.S. Ornstein, Phys. Rev. 36
(1930) 823-841.

[73] D.R. Cox and H.D. Miller, The theory of stochastic
processes (Chapman and Hall, New York, 1965).

[74]  M.  Gaster,  Proc.  R.  Soc.  London  A  430  (1990)  3-24. 

[75] K. Fraedrich, C. Ziehmann-Schlumbohm and L.A.
Smith, Estimating state dependent predictability: Some
meteorological applications, Ann. Geophys. (1992),
1992 General Assembly Supplement Volume.


Physica  D  58  (1992)  77-94 
North-Holland 


Testing  for  nonlinearity  in  time  series: 
the  method  of  surrogate  data 

James Theiler, Stephen Eubank, Andre Longtin, Bryan Galdrikian
and J. Doyne Farmer

Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
Santa Fe Institute, 1660 Old Pecos Trail, Santa Fe, NM 87501, USA
Prediction Company, 234 Griffin Street, Santa Fe, NM 87501, USA

Received  4  October  1991 

Revised  manuscript  received  11  February  1992 

Accepted  3  March  1992 


We  describe  a  statistical  approach  for  identifying  nonlinearity  in  time  series.  The  method  first  specifies  some  linear 
process  as  a  null  hypothesis,  then  generates  surrogate  data  sets  which  are  consistent  with  this  null  hypothesis,  and  finally 
computes  a  discriminating  statistic  for  the  original  and  for  each  of  the  surrogate  data  sets.  If  the  value  computed  for  the 
original  data  is  significantly  different  than  the  ensemble  of  values  computed  for  the  surrogate  data,  then  the  null  hypothesis 
is  rejected  and  nonlinearity  is  detected.  We  discuss  various  null  hypotheses  and  discriminating  statistics.  The  method  is 
demonstrated  for  numerical  data  generated  by  known  chaotic  systems,  and  applied  to  a  number  of  experimental  time  series 
which  arise  in  the  measurement  of  superfluids,  brain  waves,  and  sunspots;  we  evaluate  the  statistical  significance  of  the 
evidence  for  nonlinear  structure  in  each  case,  and  illustrate  aspects  of  the  data  which  this  approach  identifies. 


1.  Introduction 

The  inverse  problem  for  a  nonlinear  system  is 
to  determine  the  underlying  dynamical  process  in 
the  practical  situation  where  all  that  is  available 
is  a  time  series  of  data.  Algorithms  have  been 
developed  which  can  in  principle  make  this  dis¬ 
tinction,  but  they  are  notoriously  unreliable,  and 
usually  involve  considerable  human  judgement. 
Particularly  for  experimental  data  sets,  which  are 
often  short  and  noisy,  simple  autocorrelation  can 
fool  dimension  and  Lyapunov  exponent  es¬ 
timators  into  signalling  chaos  where  there  is 
none.  Most  authors  agree  that  the  methods  con¬ 
tain  many  pitfalls,  but  it  is  not  always  easy  to 
avoid  them.  While  some  data  sets  very  cleanly 
exhibit  low-dimensional  chaos,  there  are  many 
cases where the evidence is sketchy and difficult
to evaluate. Indeed, it is possible for one author
to  claim  evidence  for  chaos,  and  for  another  to 
argue  that  the  data  is  consistent  with  a  simpler 
explanation  [1-4]. 

The  real  complication  arises  because  low¬ 
dimensional  chaos  and  uncorrelated  noise  are 
not  the  only  available  alternatives.  The  erratic 
fluctuations  that  are  observed  in  an  experimental 
time  series  owe  their  dynamical  variation  to  a 
mix  of  various  influences:  chaos,  nonchaotic  but 
still  nonlinear  determinism,  linear  correlations, 
and  noise,  both  in  the  dynamics  and  in  the 
measuring  apparatus.  While  we  are  motivated  by 
the  prospect  of  ultimately  disentangling  these 
influences,  we  take  as  a  more  modest  goal  the 
detection  of  nonlinear  structure  in  a  stationary 
time  series.  (We  will  not  attempt  to  characterize 
non-stationary  time  series  -  see  refs.  [5-9]  for  a 
discussion of some of the problems arising in the
estimation  of  nonlinear  statistics  from  non¬ 
stationary  data.) 

Positive  identification  of  chaos  is  difficult;  the 
usual  way  to  detect  low-dimensional  behavior  is 
to  estimate  the  dimension  and  then  see  if  this 
value  is  small.  With  a  finite  time  series  of  noisy 
data,  the  dimension  estimated  by  the  algorithm 
will at best be approximate and often outright
wrong.  One  can  guard  against  this  by  attempting 
to  identify  the  various  sources  of  error  (both 
systematic  and  statistical),  and  then  putting  error 
bars  on  the  estimate  (see,  for  example,  refs. 
[10-18]).  But  this  can  be  problematic  for  non¬ 
linear  algorithms  like  dimension  estimators:  first, 
assignment  of  error  bars  requires  some  model  of 
the  underlying  process,  and  that  is  exactly  what 
is  not  known;  further,  even  if  the  underlying 
process  were  known,  the  computation  of  an 
error  bar  may  be  analytically  difficult  if  not 
intractable. 

The  goal  of  detecting  nonlinearity  is  consider¬ 
ably  easier  than  that  of  positively  identifying 
chaotic  dynamics.  Our  approach  is  to  specify  a 
well-defined  underlying  linear  process  or  null 
hypothesis,  and  to  determine  the  distribution  of 
the  quantity  we  are  interested  in  (dimension, 
say)  for  an  ensemble  of  surrogate  data  sets  which 
are  just  different  realizations  of  the  hypothesized 
linear  stochastic  process.  Then,  rather  than  esti¬ 
mate  error  bars  on  the  dimension  of  the  original 
data,  we  put  error  bars  on  the  value  given  by  the 
surrogates.  This  can  be  done  reliably  because  we 
have  a  model  for  the  underlying  dynamics  (the 
null  hypothesis  itself),  and  because  we  have 
many  realizations  of  the  null  hypothesis,  we  can 
estimate  the  error  bar  numerically  (from  the 
standard  deviation  of  all  estimated  dimensions  of 
the  surrogate  data  sets)  and  avoid  the  issue  of 
analytical tractability altogether.

While  this  article  elaborates  on  preliminary 
work  described  in  an  earlier  publication  [19],  our 
aim  is  to  make  this  exposition  self-contained.  In 
section  2,  we  express  the  problem  of  detecting 
nonlinearity in terms of statistical hypothesis
testing. We introduce a measure of significance,
develop  various  null  hypotheses  and  discriminat¬ 
ing  statistics,  and  describe  algorithms  for 
generating surrogate data. Section 3 demonstrates
the technique for several computer-generated
examples under a variety of conditions:
large  and  small  data  sets,  high  and  low-dimen¬ 
sional  attractors,  and  various  levels  of  observa¬ 
tional  and  dynamical  noise.  In  section  4,  we 
illustrate  the  application  of  the  method  to  several 
real  data  sets,  including  fluid  convection,  elec¬ 
troencephalograms  (EEG),  and  sunspots.  With 
real  data,  there  is  always  room  for  human  judg¬ 
ment,  and  we  argue  that  besides  formally  reject¬ 
ing  a  null  hypothesis,  the  method  of  surrogate 
data  can  also  be  useful  in  an  informal  way, 
providing  a  benchmark,  or  control  experiment, 
against  which  the  actual  data  can  be  compared. 

2.  Statistical  hypothesis  testing 

The  formal  application  of  the  method  of  surro¬ 
gate  data  is  expressed  in  the  language  of  statisti¬ 
cal  hypothesis  testing.  This  involves  two  ingredi¬ 
ents:  a  null  hypothesis  against  which  observa¬ 
tions  are  tested,  and  a  discriminating  statistic. 
The  null  hypothesis  is  a  potential  explanation 
that  we  seek  to  show  is  inadequate  for  explaining 
the  data;  and  the  discriminating  statistic  is  a 
number  which  quantifies  some  aspect  of  the  time 
series.  If  this  number  is  different  for  the  ob¬ 
served  data  than  would  be  expected  under  the 
null  hypothesis,  then  the  null  hypothesis  can  be 
rejected. 

It  is  possible  in  some  cases  to  derive  analytical¬ 
ly  the  distribution  of  a  given  statistic  under  a 
given  null  hypothesis,  and  this  approach  is  the 
basis  of  many  existing  tests  for  nonlinearity  (e.g., 
see  refs.  [20-26]).  In  the  method  of  surrogate 
data,  this  distribution  is  estimated  by  direct 
Monte Carlo simulation. An ensemble of surrogate
data sets is generated, each sharing given
properties of the observed time series (such as
mean, variance, and Fourier spectrum) but
otherwise random as specified by the null hypothesis.
For each surrogate data set, the discriminating
statistic is computed, and from this
ensemble of statistics, the distribution is approximated.

While  this  approach  can  be  computationally 
intensive,  it  avoids  the  analytical  derivations 
which  can  be  difficult  if  not  impossible.  This 
leads  to  increased  flexibility  in  the  choice  of  null 
hypotheses  and  discriminating  statistics;  in  par¬ 
ticular,  the  hypothesis  and  statistic  can  be  chosen 
independently  of  each  other.  The  method  of 
surrogate  data  is  basically  an  application  of  the 
“bootstrap”  method  of  modern  statistics.  These 
methods  have  by  now  achieved  widespread 
popularity  for  reasons  that  are  well  described  in 
Efron’s 1979 manifesto [27]. A more recent reference,
which applies the bootstrap in a context
very similar to ours, is by Tsay [28].

2.1.  Computing  significance 

Let $Q_D$ denote the statistic computed for the
original time series, and $Q_{H_i}$ for the ith surrogate
generated under the null hypothesis. Let $\mu_H$ and
$\sigma_H$ denote the (sample) mean and standard deviation
of the distribution of $Q_H$.

If  multiple  realizations  are  available  for  the 
observational  data,  then  it  may  be  possible  to 
compare  the  two  distributions  (observed  data 
and  surrogate)  directly,  using  for  instance  the 
Kolmogorov-Smirnov  or  Mann-Whitney  test, 
which  compare  the  full  distributions,  or  possibly 
a  Student-t  test  which  only  compares  their 
means. For the present purposes, however, we
consider that only one experimental data set is
available (see footnote 1), and we use a kind of t test.

Footnote 1: Of course, it is always possible to create several realizations
out of that single set by chopping up the data; we have
not tried this approach, but just as the convergence of
numerical algorithms like correlation dimension and
Lyapunov exponent estimation is compromised by shortened
data sets, so we suspect will be their power to reject a
null hypothesis. This is only a suspicion, however; it would
be worthwhile to compare the relative power of several short
data sets versus that of one long data set.


We define our measure of “significance” by
the difference between the original and the mean
surrogate value of the statistic, divided by the
standard deviation of the surrogate values:

$$\mathcal{S} \equiv \frac{|Q_D - \mu_H|}{\sigma_H}. \qquad (1)$$

The significance is properly a dimensionless
quantity, but it is natural to call the units of $\mathcal{S}$
“sigmas”. If the distribution of the statistic is
gaussian (and numerical experiments indicate
that this is often a reasonable approximation),
then the p-value is given by $p = \mathrm{erfc}(\mathcal{S}/\sqrt{2})$; this
is the probability of observing a significance $\mathcal{S}$ or
larger if the null hypothesis is true.
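
As a rough illustration, the significance just defined (difference between the original statistic and the surrogate mean, in units of the surrogate standard deviation) and its gaussian p-value might be computed as follows. The function name and the sample values are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics

def significance(q_data, q_surrogates):
    """Return (S, p), where S = |Q_D - mu_H| / sigma_H and
    p = erfc(S / sqrt(2)) is the gaussian two-sided p-value."""
    mu_h = statistics.fmean(q_surrogates)    # sample mean of surrogate statistics
    sigma_h = statistics.stdev(q_surrogates)  # sample standard deviation
    s = abs(q_data - mu_h) / sigma_h
    p = math.erfc(s / math.sqrt(2.0))
    return s, p

# Hypothetical statistic values, for illustration only.
surrogate_values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
s, p = significance(11.0, surrogate_values)
```

In practice the list of surrogate values would hold the discriminating statistic evaluated on each surrogate data set.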

A  more  robust  way  to  define  significance 
would  be  directly  in  terms  of  p-values  with  rank 
statistics.  For  example,  if  the  observed  time 
series  has  a  statistic  which  is  in  the  lower  one 
percentile of all the surrogate statistics (and at
least a hundred surrogates would be needed to
make this determination), then a (two-sided)
p-value of p = 0.02 could be quoted. We have
used  eq.  (1)  for  the  investigations  reported  here 
because  the  computational  effort  in  that  case  is 
not  as  severe. 
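
A rank-based p-value of the kind described above could be sketched as below. This is an assumption-laden illustration: the function name is hypothetical, and the "+1" counting convention (which treats the observed value as one more member of the ensemble) is one common choice, not something specified in the text.

```python
def rank_p_value(q_data, q_surrogates):
    """Two-sided Monte Carlo p-value from the rank of q_data among the surrogates.

    Assumed convention: the one-sided p-value is (k + 1) / (n + 1), where k is
    the number of surrogate values at least as extreme as the observed one.
    """
    n = len(q_surrogates)
    at_or_below = sum(1 for q in q_surrogates if q <= q_data)
    at_or_above = sum(1 for q in q_surrogates if q >= q_data)
    one_sided = (min(at_or_below, at_or_above) + 1) / (n + 1)
    return min(1.0, 2.0 * one_sided)

# Hypothetical example: 100 surrogate values, observed value below all of them.
p_extreme = rank_p_value(0.0, [float(i) for i in range(1, 101)])
```

With 100 surrogates and an observed value more extreme than all of them, this convention gives a p-value close to the 0.02 quoted above; no smaller value can be resolved without more surrogates.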


2.1.1.  Estimating  error  bars  on  significance 
Our  plots  of  significance  include  error  bars; 
these  are  meant  only  as  a  rough  guide  and  are 
computed  assuming  that  the  statistics  are  distrib¬ 
uted  as  a  gaussian. 

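We write the error bar on $\mathcal{S}$ as $\Delta\mathcal{S}$, and it is
computed by standard propagation of errors
methodology. Here

$$\left(\frac{\Delta\mathcal{S}}{\mathcal{S}}\right)^2 = \left(\frac{\Delta|\mu_H - \mu_D|}{|\mu_H - \mu_D|}\right)^2 + \left(\frac{\Delta\sigma_H}{\sigma_H}\right)^2 = \frac{(\Delta\mu_H)^2 + (\Delta\mu_D)^2}{(\mu_H - \mu_D)^2} + \left(\frac{\Delta\sigma_H}{\sigma_H}\right)^2. \qquad (2)$$

Now the error of the sample mean based on N
observations is given by $(\Delta\mu)^2 = \sigma^2/N$, and the
error of the sample standard deviation is
$(\Delta\sigma)^2 = \sigma^2/2N$, so we can write

$$\left(\frac{\Delta\mathcal{S}}{\mathcal{S}}\right)^2 = \frac{\sigma_H^2/N_H + \sigma_D^2/N_D}{(\mu_H - \mu_D)^2} + \frac{1}{2N_H}. \qquad (3)$$

The absolute error bar is then given by

$$\Delta\mathcal{S} = \sqrt{\frac{1 + \mathcal{S}^2/2}{N_H} + \frac{(\sigma_D/\sigma_H)^2}{N_D}}. \qquad (4)$$

When only a single realization of the time
series is available, we take $\sigma_D = 0$ and ignore the
second term in the above equation. This reports
the error bar on the significance of the specific
realization.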
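
The error bar can be evaluated directly once the significance and the ensemble sizes are known. The sketch below reads eq. (4) as the square root of (1 + S^2/2)/N_H plus (sigma_D/sigma_H)^2/N_D; the function and argument names are hypothetical, and the default arguments encode the single-realization case in which the second term vanishes.

```python
import math

def significance_error_bar(s, n_h, sigma_d=0.0, sigma_h=1.0, n_d=1):
    """Error bar on the significance S under the gaussian assumption.

    s        significance
    n_h      number of surrogate data sets
    sigma_d  standard deviation of the statistic over data realizations
             (0 when only one realization is available)
    sigma_h  standard deviation of the statistic over the surrogates
    n_d      number of data realizations
    """
    surrogate_term = (1.0 + 0.5 * s * s) / n_h
    data_term = (sigma_d / sigma_h) ** 2 / n_d
    return math.sqrt(surrogate_term + data_term)

# Hypothetical case: S = 2 estimated from 100 surrogates, one data realization.
delta_s = significance_error_bar(2.0, 100)
```

The error bar grows with the significance itself, so a large significance estimated from few surrogates is known only roughly.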

In  our  numerical  experiments,  we  use  several 
realizations  of  the  time  series  under  question. 
However,  the  significance  we  report  is  not  based 
on  the  collective  evidence  of  the  several,  but  is 
the  average  significance  of  each  realization  taken 
individually.  The  error  bar  in  that  case  describes 
the  expected  error  of  our  estimate  of  this  aver¬ 
age.  Note  that  this  differs  from  the  error  re¬ 
ported  for  a  single  realization. 


2.2.  Toward  a  hierarchy  of  null  hypotheses 


The  null  hypothesis  defines  the  nature  of  the 
candidate  process  which  may  or  may  not 
adequately  explain  the  data.  Our  null  hypotheses 
usually  specify  that  certain  properties  of  the 
original  data  are  preserved  -  such  as  mean  and 
variance  -  but  that  there  is  no  further  structure 
in  the  time  series.  The  surrogate  data  is  then 
generated  to  mimic  these  preserved  features  but 
to  otherwise  be  random.  There  is  some  latitude 
in  choosing  which  features  ought  to  be  pre¬ 
served;  certainly  mean  and  variance,  and  pos¬ 
sibly  also  the  Fourier  power  spectrum.  If  the  raw 
data  is  discretized  to  integer  values,  then  the 
surrogate  data  should  be  similarly  discretized. 

Ultimately  we  envision  a  hierarchy  of  null 
hypotheses  against  which  time  series  might  be 
compared.  Beginning  with  the  simplest  hypoth¬ 
eses,  and  increasing  in  generality,  the  following 
sections  outline  some  of  the  possibilities  that  we 
have  considered. 


2.2.1.  Temporally  independent  data 

The  first  (and  easiest)  question  to  answer 
about  a  time  series  is  whether  there  is  evidence 
for  any  dynamics  at  all.  The  null  hypothesis  in 
this  case  is  that  the  observed  data  is  fully  de¬ 
scribed  by  independent  and  identically  distribut¬ 
ed  (IID)  random  variables.  If  the  distribution  is 
further  assumed  to  be  gaussian,  then  surrogate 
data  can  be  readily  generated  from  a  standard 
pseudorandom  number  generator,  normalized  to 
the  mean  and  variance  of  the  original  data. 

To  test  the  hypothesis  of  IID  noise  with  arbi¬ 
trary  amplitude  distribution  in  an  analysis  of 
stock market returns, Scheinkman and LeBaron
[29]  generated  surrogate  data  by  shuffling  the 
time-order  of  the  original  time  series.  The  surro¬ 
gate  data  is  obviously  guaranteed  to  have  the 
same  amplitude  distribution  as  the  original  data, 
but  any  temporal  correlations  that  may  have 
been  in  the  data  are  destroyed.  Breeden  and 
Packard  also  used  a  shuffling  process  along  with 
a  sophisticated  nonlinear  predictor  to  prove  that 
there  was  some  dynamical  structure  to  a  time 
series  of  quasar  data  which  were  sampled 
nonuniformly  in  time  [30]. 
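
A shuffled surrogate of the kind described here takes only a few lines. This is a minimal sketch with a hypothetical function name; it is not the code used in refs. [29, 30].

```python
import random

def shuffled_surrogate(series, rng=None):
    """Return a random permutation of the series.

    The amplitude distribution is preserved exactly, while any temporal
    correlation in the original ordering is destroyed.
    """
    if rng is None:
        rng = random.Random()
    surrogate = list(series)  # copy, so the original series is left untouched
    rng.shuffle(surrogate)
    return surrogate

# Hypothetical data, for illustration only.
original = [0.3, 1.2, -0.7, 2.5, 0.0, 1.1]
surrogate = shuffled_surrogate(original, random.Random(0))
```

Passing a seeded generator, as in the usage lines, makes an ensemble of surrogates reproducible.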


2.2.2.  Ornstein-Uhlenbeck  process 

A  very  simple  case  of  non-IID  noise  is  given 
by  the  Ornstein-Uhlenbeck  process  [31].  A  dis¬ 
crete  sampling  of  this  process  yields  a  model  of 
the  form 


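$$x_t = a_0 + a_1 x_{t-1} + \sigma e_t, \qquad (5)$$

where $e_t$ is uncorrelated gaussian noise of unit
variance. The coefficients $a_0$, $a_1$, and $\sigma$ collectively
determine the mean, variance, and autocorrelation
time of the time series. In fact, the
autocorrelation function is exponential in this
case:

$$A(\tau) \equiv \frac{\langle x_t x_{t-\tau}\rangle - \langle x_t\rangle^2}{\langle x_t^2\rangle - \langle x_t\rangle^2} = e^{-\lambda|\tau|}, \qquad (6)$$

where $\langle\ \rangle$ denotes an average over time t, and
$\lambda = -\log a_1$.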


To make surrogate data sets, the mean $\mu$,
variance $v$, and first autocorrelation $A(1)$ are
estimated from the original time series; from
these the coefficients are fit: $a_1 = A(1)$, $a_0 =
\mu(1 - a_1)$, and $\sigma^2 = v(1 - a_1^2)$. Finally, one generates
the surrogate data by iterating eq. (5),
using a pseudorandom number generator for the
unit variance gaussian $e_t$.
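
The recipe above can be sketched as follows, reading the fitted coefficients as a_1 = A(1), a_0 = mean times (1 - a_1), and sigma^2 = variance times (1 - a_1^2). The function names, the biased (divide by n) autocovariance estimate, and the burn-in length are assumptions made for this illustration; a non-constant input series is assumed.

```python
import math
import random
import statistics

def fit_ar1(series):
    """Estimate (a0, a1, sigma) of x_t = a0 + a1*x_{t-1} + sigma*e_t."""
    n = len(series)
    mu = statistics.fmean(series)
    v = statistics.pvariance(series, mu)
    # Lag-one autocovariance; dividing by n keeps |a1| <= 1.
    cov1 = sum((series[t] - mu) * (series[t - 1] - mu) for t in range(1, n)) / n
    a1 = cov1 / v
    a0 = mu * (1.0 - a1)
    sigma = math.sqrt(v * (1.0 - a1 * a1))
    return a0, a1, sigma

def ou_surrogate(series, rng=None, burn_in=100):
    """Generate a surrogate of the same length by iterating the fitted model."""
    if rng is None:
        rng = random.Random()
    a0, a1, sigma = fit_ar1(series)
    x = statistics.fmean(series)  # start at the mean, then discard a transient
    out = []
    for t in range(burn_in + len(series)):
        x = a0 + a1 * x + sigma * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x)
    return out

# Hypothetical short series, for illustration only.
a0, a1, sigma = fit_ar1([1.0, 2.0, 3.0, 4.0])
```

Each call to `ou_surrogate` with a different random generator gives one more realization of the null hypothesis.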

2.2.3.  Linearly  autocorrelated  gaussian  noise 

We  can  generalize  the  above  null  hypothesis 
by  extending  eq.  (5)  to  arbitrary  order.  This 
leads  to  the  hypothesis  that  is  generally  associ¬ 
ated  with  linearity.  We  emphasize  that  we  are 
discussing  linear  gaussian  processes  here  (see 
Tong  [26,  pp.  13,  14]  for  a  brief  description  of 
some of the surprising properties of linear non-gaussian
processes); Section 2.2.4 describes one
approach toward a nongaussian null hypothesis.
The model is described by fitting coefficients $a_k$
and $\sigma$ to a process

$$x_t = a_0 + \sum_{k=1}^{q} a_k x_{t-k} + \sigma e_t, \qquad (7)$$

which mimics the original time series in terms of
mean, variance, and the autocorrelation function
for delays of $\tau = 1, \ldots, q$. This is an autoregressive
(AR) model; a more general model
includes  a  moving  average  (MA)  of  time  delayed 
noise  terms  as  well,  and  the  combination  is 
called  an  ARMA  model.  For  large  enough  q,  the 
models  are  essentially  equivalent.  The  null  hy¬ 
pothesis  in  this  case  is  that  all  the  structure  in  the 
time  series  is  given  by  the  autocorrelation  func¬ 
tion,  or  equivalently,  by  the  Fourier  power 
spectrum. 

One  algorithm  for  generating  surrogate  data 
under  this  null  hypothesis  is  again  to  iterate  eq. 
(7),  where  the  coefficients  have  been  fit  to  the 
original  data.  We  describe  an  alternative  al¬ 
gorithm  in  section  2.4.1  which  involves  ran¬ 
domizing  the  phases  of  a  Fourier  transform.  (To 
our  knowledge,  this  algorithm  was  first  suggested 
in this context by Osborne et al. [5], and independently
in refs. [15, 32].) The alternative
algorithm generates surrogate data which by construction
has the same Fourier spectrum as the
original data. While the two algorithms are essentially
equivalent, we use the Fourier transform
method because it is numerically more stable. If
the  values  of  the  coefficients  in  eq.  (7)  are 
mis-estimated  slightly,  it  is  possible  that  iterating 
the  equation  will  lead  to  a  time  series  which 
diverges  to  infinity;  this  is  particularly  prob¬ 
lematic  if  the  raw  time  series  is  nearly  periodic  or 
highly  sampled  continuous  data. 
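
The phase-randomization idea mentioned above can be illustrated with a short sketch. It is only a sketch under stated assumptions: the details given in section 2.4.1 may differ, the function names are hypothetical, and a naive O(n^2) transform is used to keep the example free of dependencies (a fast Fourier transform library would be used in practice).

```python
import cmath
import math
import random

def dft(values):
    """Naive discrete Fourier transform, O(n^2); for illustration only."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
            for k in range(n)]

def phase_randomized_surrogate(series, rng=None):
    """Rotate each Fourier amplitude by a random phase, keeping the conjugate
    symmetry needed for a real-valued result. The zero-frequency term (and
    the Nyquist term, for even n) is left unchanged."""
    if rng is None:
        rng = random.Random()
    n = len(series)
    spectrum = dft(series)
    for k in range(1, (n - 1) // 2 + 1):
        rotation = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        spectrum[k] *= rotation
        spectrum[n - k] = spectrum[k].conjugate()
    # Inverse transform; the imaginary part is rounding error only.
    return [(sum(c * cmath.exp(2j * math.pi * k * t / n)
                 for k, c in enumerate(spectrum)) / n).real
            for t in range(n)]

# Hypothetical data, for illustration only.
original = [math.sin(0.7 * t) + 0.1 * t for t in range(16)]
surrogate = phase_randomized_surrogate(original, random.Random(1))
```

Because only the phases are altered, the surrogate has the same power spectrum (and hence the same autocorrelation function and mean) as the original, and no recursion is iterated that could diverge.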

We remark that this is the null hypothesis that
is associated with residual-based tests for nonlinearity;
for instance, see refs. [22-24, 33, 34].
In these tests, a model of the form of eq. (7) is
fit to the data, and the residuals

$$\epsilon_t = x_t - \left(a_0 + \sum_{k=1}^{q} a_k x_{t-k}\right) \qquad (8)$$

are tested against a null of temporally independent
noise. In ref. [19], we argue that it is usually
preferable to use the method of surrogate data
on the raw data directly, rather than working
with residuals.
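
For concreteness, the residuals of eq. (8) for a given set of coefficients might be computed as below. The function name is hypothetical, and the coefficients are assumed to come from whatever fitting procedure is in use; this sketch does no fitting itself.

```python
def ar_residuals(series, a0, coeffs):
    """Residuals x_t - (a0 + sum_k a_k * x_{t-k}) for t = q, ..., n-1,
    where coeffs = [a_1, ..., a_q]."""
    q = len(coeffs)
    residuals = []
    for t in range(q, len(series)):
        prediction = a0 + sum(coeffs[k - 1] * series[t - k] for k in range(1, q + 1))
        residuals.append(series[t] - prediction)
    return residuals

# Hypothetical series and first-order coefficients, for illustration only.
res = ar_residuals([1.0, 2.0, 3.0, 4.0], 1.875, [0.25])
```

The first q points have no complete history and so yield no residual.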

2.2.4.  Static  nonlinear  transform  of  linear 
gaussian  noise 

One way to generalize the above null hypothesis
to cases where the data is nongaussian is to
suppose that although the dynamics is linear, the
observation function may be nonlinear. In particular,
we hypothesize that there exists an “underlying”
time series $\{y_t\}$, consistent with the
null hypothesis of linear gaussian noise, and an
observed time series $\{x_t\}$ given by

$$x_t = h(y_t). \qquad (9)$$

Since $x_t$ depends only on the current value of $y_t$,
and not on derivatives or past values, the filter
$h(\ )$ is said to be “static” or “instantaneous”. To
permit the generation of surrogate data, we must
further assume (as part of the null hypothesis)
that the observation function $h(\ )$ is effectively
invertible.

In section 2.4.3, an algorithm for generating
surrogate data corresponding to this null hypothesis
is described. Its effect is to shuffle the time-order
of the data but in such a way as to preserve
the linear correlations of the underlying time
series y_t = h^{-1}(x_t). One advantage of shuffling
over, for example, a smooth fit to the function
h( ), is that any discretization that was present in
the original data will be reflected in the surrogate
data.

Note  that  time  series  in  this  class  are  strictly 
speaking  nonlinear,  but  that  the  nonlinearity  is 
not  in  the  dynamics.  Most  conventional  tests  for 
nonlinearity  would  (correctly)  conclude  that  the 
time  series  is  nonlinear,  but  would  not  indicate 
whether  the  nonlinearity  was  in  the  dynamics  or 
in the amplitude distribution. By using surrogate
data  tailored  to  this  specific  null  hypothesis,  it 
becomes  possible  to  make  such  fine  distinctions 
about  the  nature  of  the  dynamics. 


2.2.5.  More  general  null  hypotheses 

Eventually,  we  would  like  to  extend  this  list  to 
consider  more  general  cases.  A  natural  next  step 
is  a  null  hypothesis  that  the  dynamics  is  a  noisy 
limit  cycle.  Such  time  series  cannot  be  described 
by  a  linear  process,  even  if  viewed  through  a 
static  nonlinear  transform.  Yet  it  is  often  of  great 
interest,  particularly  in  systems  driven  by  season¬ 
al  cycles,  to  determine  the  nature  of  the  inter- 
seasonal  variation. 

There  is  another  large  class  of  nonlinear  sto¬ 
chastic  processes  which  are  not  predictable  even 
in  the  mean;  among  these  are  the  conditional 
heteroscedastic  models  (for  which  the  variance  is 
conditioned  on  the  past,  but  not  the  mean)  in 
favor  among  economists.  While  there  is  definite 
nonlinear  structure  in  these  time  series,  it  is  not 
manifested in enhanced predictability by nonlinear
models. (For instance, it may be possible
to predict the magnitude |x_t| from past values of
x, but not the sign.)


Fig. 1. Shown is a time series from the Mackey-Glass equation
with τ = 30, which is known to be low-dimensional and
chaotic, and seven surrogate time series generated by the
WFT algorithm, in panels (a)-(h). It is often not obvious by
eye which is the actual data set and which are the surrogates.
In this case it is series (f) which is the real one.


2.3.  Battery  of  discriminating  statistics 

The  method  of  surrogate  data  can  in  principle 
be  used  with  virtually  any  discriminating  statistic. 
Formally,  all  that  is  required  to  reject  a  null 
hypothesis  is  that  the  statistic  have  a  different 
distribution  for  the  data  than  for  the  surrogates. 
However,  the  method  is  more  useful  if  the  statis¬ 
tic  actually  provides  a  good  estimate  of  a  phys¬ 
ically  interesting  quantity;  in  that  case,  one  may 
not  only  formally  reject  a  null  hypothesis,  but 
also  informally  characterize  the  nature  of  the 
nonlinearity. 

Since  we  were  motivated  by  the  possibility  that 
the  underlying  dynamics  may  be  chaotic,  our 
original  choices  for  discriminating  statistics  were 
the  correlation  dimension,  Lyapunov  exponent, 
and  forecasting  error.  Ideally,  dimension  counts 
degrees  of  freedom,  Lyapunov  exponent  quan¬ 
tifies  the  sensitivity  to  initial  conditions,  and 
forecasting  error  tests  for  determinism.  One  of 
the  ultimate  aims  in  this  project  is  to  understand 
the  conditions  in  which  one  or  the  other  of  these 
methods  will  be  more  effective. 

We  should  remark  that  a  danger  in  using  a 
battery  of  statistics  is  that  one  of  them,  by 
chance,  will  show  up  as  significant.  This  effect 
can  be  formally  accounted  for  by  keeping  strict 
count  of  the  number  of  tests  used,  and  increasing
the  threshold  of  significance  accordingly.  The
formal  approach  tends  to  be  more  conservative 
than  necessary,  since  the  tests  are  not  really 
independent  of  each  other,  but  it  is  still  a  recom¬ 
mended  practice  to  maintain  a  reasonably  high 
threshold  of  significance. 
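
As an illustration of this bookkeeping, the following Python sketch divides the desired overall false-positive rate by the number of statistics in the battery. The Bonferroni rule and the name `bonferroni_threshold` are assumptions made for illustration; the discussion above does not prescribe a particular correction.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test level that keeps the overall false-positive rate at or below alpha.

    Assumes the simple (conservative) Bonferroni rule, which ignores any
    dependence between the tests.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Example: three discriminating statistics, overall level 0.05.
per_test_level = bonferroni_threshold(0.05, 3)
```

Because the statistics in a battery are usually correlated, a rule of this kind tends to be stricter than necessary, which is consistent with the remark above.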

2.3.1. Correlation dimension, ν

Dimension  is  an  exponent  which  characterizes 
the  scaling  of  some  bulk  measure  with  linear 
size.  A  number  of  algorithms  are  available  [17,
35]  for  estimating  the  dimension  of  an  underlying 
strange  attractor  from  a  time  series;  we  chose  a 
box-assisted  variation  [36]  (see  Grassberger  [37] 
for  an  elegant  alternative)  of  the  Grassberger- 
Procaccia-Takens  algorithm  [38-40]  to  compute 
a  correlation  integral,  and  the  best  estimator  of 
Takens  [12]  for  the  dimension  itself.  To  compute 
a  dimension,  it  is  necessary  to  choose  some 
range  of  sizes  over  which  the  scaling  is  to  be 
estimated.  The  Takens  estimator  requires  only 
an  upper  cutoff  size;  we  used  one-half  of  the  rms 
variation  in  the  time  series  for  this  value.  (See 
Ellner  [41]  for  an  estimator  that  takes  both  an 
upper  and  a  lower  cutoff.) 
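
The following Python sketch shows one way a Takens-type estimate might be computed from delay vectors. It is a brute-force O(N^2) illustration, not the box-assisted algorithm of ref. [36]; the function names, the max-norm distance, and the use of the standard deviation as the "rms variation" are all assumptions.

```python
import numpy as np


def delay_embed(x, m, lag=1):
    """Return the delay vectors (x[t], x[t+lag], ..., x[t+(m-1)*lag]) as rows."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * lag
    return np.column_stack([x[i * lag: i * lag + n] for i in range(m)])


def takens_dimension(x, m, cutoff=None, lag=1):
    """Takens estimate: -1 / <ln(r/R)> over all pair distances r below the cutoff R."""
    x = np.asarray(x, dtype=float)
    if cutoff is None:
        cutoff = 0.5 * np.std(x)  # assumed reading of "one-half of the rms variation"
    v = delay_embed(x, m, lag)
    # Max-norm distance between every pair of delay vectors (memory grows as N^2).
    dist = np.max(np.abs(v[:, None, :] - v[None, :, :]), axis=2)
    r = dist[np.triu_indices(len(v), k=1)]  # each distinct pair counted once
    r = r[(r > 0.0) & (r < cutoff)]
    return -1.0 / np.mean(np.log(r / cutoff))
```

For uncorrelated noise the estimate should sit near the embedding dimension m, somewhat reduced by edge effects; only an upper cutoff is needed, as noted above.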

We  will  concede  that  this  choice  is  a  bit  arbi¬ 
trary;  one  might  prefer  a  more  sophisticated 
algorithm  for  choosing  a  good  scaling  range.  L. 
Smith  (personal  communication)  has  suggested 
choosing  the  range  “by  eye”  for  the  raw  data 
and  then  keeping  this  range  for  the  surrogates. 
From  the  point  of  view  of  the  formal  test,  it  does 
not  really  matter,  but  if  we  are  to  ask  for  insight 
as  well  as  a  rejected  null,  then  it  is  important  to 
use a good dimension estimator. In the N → ∞
limit, the estimator we describe will not converge
to the actual precise dimension of the attractor,
but we note that it will converge fairly rapidly to
a number which is often reasonably close to the
actual dimension (of course, one can always
contrive  counterexamples);  in  particular,  it  will 
properly  indicate  low-dimensionality  when  it  sees 
it.  While  we  do  not  claim  that  this  is  the  optimal 
dimension  estimator,  we  believe  that  it  is  a  use¬ 
ful  one. 


2.3.2.  Forecasting  error,  e 

A  system  is  deterministic  if  its  future  can  be 
predicted.  A  natural  statistic  in  this  case  is  some 
average  of  the  forecasting  errors  obtained  from 
nonlinear  modeling.  The  method  we  use  entails 
first splitting the time series into a fitting set of
length N_f, and a testing set of length N_t, with
N_f + N_t = N, the length of the time series; then
fitting a local linear model [42] to the fitting set,
with locality given by the number of neighbors k; and
finally, using this model to forecast the values in
the  testing  set,  and  comparing  them  with  the 
actual  values. 

The prediction error e_t = x_t - \hat{x}_t is the difference
between the actual value of x and the
predicted value, \hat{x}; we define our discriminating
statistic as the log median absolute prediction
error.

Several modeling parameters must be chosen,
including the partitioning of the data set into
fitting (N_f) and testing (N_t) segments, the number
of steps ahead to predict (T), and number of
neighbors (k) used in the local linear fit. We
arbitrarily chose to divide the fitting and testing
sets equally, with N_f = N_t = N/2, and to predict
one step ahead, so T = 1. For oversampled continuous
data, a larger T would be more appropriate.
The choice of k is also important. For the
results in this article, we set k = 2m, which is
twice the minimum number needed for a fit, but
we note that this is often not optimal. Indeed,
Casdagli [43, 44] has advocated sweeping the
parameter k in a local linear forecaster as an
exploratory method to look for nonlinearity in
the first place.
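
The forecasting statistic described above can be sketched in Python as follows. This is a minimal illustration, not the code used for the results in this article: the function name, the max-norm neighbor search, the constant term in the local fit, and the natural logarithm are assumptions.

```python
import numpy as np


def forecast_statistic(x, m=3, k=None, T=1):
    """Log median absolute out-of-sample error of a local linear predictor."""
    x = np.asarray(x, dtype=float)
    if k is None:
        k = 2 * m  # the choice quoted in the text
    t = np.arange(m - 1, len(x) - T)  # times with a full delay vector and a target
    vecs = np.column_stack([x[t - j] for j in range(m)])
    targets = x[t + T]
    half = len(t) // 2  # equal fitting and testing sets
    fit_v, fit_y = vecs[:half], targets[:half]
    test_v, test_y = vecs[half:], targets[half:]
    errors = np.empty(len(test_v))
    for i, v in enumerate(test_v):
        dist = np.max(np.abs(fit_v - v), axis=1)
        nbrs = np.argsort(dist)[:k]  # k nearest neighbors in the fitting set
        design = np.column_stack([np.ones(len(nbrs)), fit_v[nbrs]])
        coef, *_ = np.linalg.lstsq(design, fit_y[nbrs], rcond=None)
        errors[i] = test_y[i] - (coef[0] + v @ coef[1:])
    return np.log(np.median(np.abs(errors)))
```

The median makes the statistic insensitive to the occasional badly conditioned local fit; sweeping k, as Casdagli suggests, only requires calling the function with different values of `k`.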

2.3.3. Estimated Lyapunov exponent, λ

Following standard practices [45-47], we compute
Lyapunov exponents by multiplying Jacobian
matrices along a trajectory, with the matrices
computed by local linear fits, and we use
QR decomposition to maintain orthogonality.

We  have  found  that  numerical  estimation  of 
even  the  largest  Lyapunov  exponent  can  be 
problematic  in  the  presence  of  noise.  Indeed,  for
our  surrogate  data  sets,  for  which  the  linear
dynamics  is  contracting,  we  often  obtain  positive 
Lyapunov  exponents.  This  indicates  that  our 
Lyapunov  exponent  estimator  (which,  as  we 
have  described,  is  fairly  standard)  is  seriously 
flawed,  something  we  might  not  have  noticed 
had  we  not  tested  with  linear  stochastic  time 
series.  We  are  aware  of  at  least  one  group  whose 
Lyapunov  exponent  estimator  explicitly  consid¬ 
ers  the  effects  of  noise  [48-50].  While  our  es¬ 
timator  is  arguably  still  useful  as  a  statistic  which 
formally  distinguishes  original  data  from  surro¬ 
gate  data,  it  would  be  better  to  use  a  discriminat¬ 
ing  statistic  which  correctly  quantifies  some  fea¬ 
ture  of  the  dynamics.  For  that  reason,  we  have 
avoided  using  the  Lyapunov  exponent  estimator 
in  this  article. 

2.3.4.  Other  discriminating  statistics 

We  have  found  that  using  the  correlation  inte¬ 
gral  (C(r)  for  some  value  of  r)  directly  as  a 
discriminating  statistic  generally  provides  a  more 
powerful  discrimination  than  the  estimated  di¬ 
mension  itself,  but  of  course  it  is  less  useful  as  an 
informal  tool.  L.  Smith  (personal  communica¬ 
tion)  has  suggested  using  a  statistic  which  charac¬ 
terizes  the  linearity  of  a  log  C(r)  versus  log  r 
curve. We have also considered but not implemented
two-sided forecasting: predicting x_t
from the past and future values x_{t-1}, x_{t-2}, ...
and x_{t+1}, x_{t+2}, ..., instead of the usual forecasting which
uses only the past (this was inspired by the
simple  noise  reduction  technique  suggested  by 
Schreiber  and  Grassberger  [51]).  In  our  forecast¬ 
ing,  we  are  careful  to  distinguish  the  “training” 
set  from  the  “testing”  set,  so  that  the  forecasting 
statistic  is  an  out-of-sample  error;  but  the  in- 
sample  fitting  error  may  also  suffice  as  a  dis¬ 
criminating  statistic.  We  have  found  that  the 
BDS  test  [33],  which  was  designed  to  test  for  any 
temporal  correlation  at  all  -  linear  or  nonlinear, 
can readily be extended to test other null hypotheses;
we use the same statistic, but we do
not pre-whiten the data, and instead of relying
on an analytical derivation of the distribution
function, we use surrogate data. Higher and
cross moments provide another class of discriminating
statistic; in fact, many of these are
the basis of traditional tests for nonlinearity in a
time series (e.g., see refs. [22-24]). We have
found that a simple skewed difference statistic,
defined by Q = ⟨(x_{t+1} - x_t)^3⟩ / ⟨(x_{t+1} - x_t)^2⟩, is
both  rapidly  computable  and  often  quite  power¬ 
ful.  Informally,  this  statistic  indicates  the  asym¬ 
metry  between  rise  and  fall  times  in  the  time 
series. The most direct example we know is due
to Brock, Lakonishok, and LeBaron [52], who
used  technical  trading  rules  as  discriminating 
statistics  for  financial  data;  here  there  is  no 
difficulty  interpreting  the  informal  meaning  of 
the  statistic:  it  is  how  much  money  you  should 
have  made  using  that  rule  in  that  market. 
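
As a small illustration of how cheaply a moment-based statistic can be computed, the sketch below evaluates a skewed difference statistic, taking Q to be the ratio of the mean cubed first difference to the mean squared first difference. That reading of the definition, and the function name, are assumptions.

```python
import numpy as np


def skewed_difference(x):
    """Q = <(x[t+1] - x[t])^3> / <(x[t+1] - x[t])^2> for a time series x."""
    d = np.diff(np.asarray(x, dtype=float))
    return np.mean(d ** 3) / np.mean(d ** 2)


# A sawtooth that rises slowly and falls quickly gives a negative Q.
sawtooth = np.tile([0.0, 1.0, 2.0, 3.0], 5)
q_sawtooth = skewed_difference(sawtooth)
```

Reversing the time order of a series flips the sign of Q, which is why it responds to asymmetry between rise and fall times.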

2.4.  Algorithms  for  generating  surrogate  data 

In  this  section,  we  describe  algorithms  we  use 
for  generating  surrogate  data.  The  first  two  are 
consistent with the hypothesis of linearly correlated
noise described in section 2.2.3, and the
third  adjusts  for  the  possibility  of  a  static  non¬ 
linear  transform  as  discussed  in  section  2.2.4. 

2.4.1.  Unwindowed  Fourier  transform  (FT) 
algorithm 

This  algorithm  is  based  on  the  null  hypothesis 
that  the  data  come  from  a  linear  gaussian  pro¬ 
cess.  The  surrogate  data  are  constructed  to  have 
the  same  Fourier  spectra  as  the  raw  data.  The 
algorithm is described in more detail in ref. [19],
but we briefly note the main features. First, the
Fourier transform is computed for positive and
negative frequencies f = 0, 1/N, 2/N, ..., 1/2,
and without the benefit of windowing. Although
windowing is generally recommended when it is
the power spectrum which is of ultimate interest
[53], we originally chose not to use windowing
because what we wanted was for the real and
surrogate data to have the same power spectrum;
we were not concerned with the spectrum per
se. The Fourier transform has a complex amplitude
at each frequency; to randomize the
phases, we multiply each complex amplitude by
e^{iφ}, where φ is independently chosen for each
frequency from the interval [0, 2π]. In order for
the inverse Fourier transform to be real (no
imaginary components), we must symmetrize the
phases, so that φ(f) = -φ(-f). Finally, the inverse
Fourier transform is the surrogate data.
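
The steps above can be sketched with NumPy as follows. This is an illustrative implementation under stated assumptions, not the code of ref. [19]: the function name is hypothetical, and the real-input transform `numpy.fft.rfft` is used so that the phase symmetry φ(f) = -φ(-f) is enforced implicitly rather than by hand.

```python
import numpy as np


def ft_surrogate(x, rng=None):
    """Phase-randomized surrogate with the same Fourier amplitudes as x."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)  # non-negative frequencies only
    phases = rng.uniform(0.0, 2.0 * np.pi, size=len(spectrum))
    phases[0] = 0.0  # the zero-frequency term (the mean) must stay real
    if n % 2 == 0:
        phases[-1] = 0.0  # so must the f = 1/2 term when N is even
    # irfft supplies the conjugate negative-frequency half, so the result is real.
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)
```

Each call with a fresh random generator yields a new realization; the power spectrum of every realization matches that of the input to rounding error.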

One  limitation  of  this  algorithm  is  that  it  does 
not  reproduce  “pure”  frequencies  very  well. 
What  happens  is  that  nearby  frequencies  in 
Fourier  space  are  “contaminated”  and  then  be¬ 
cause  their  phases  are  randomized,  they  end  up 
“beating”  against  each  other  and  producing 
spurious  low-frequency  effects.  (We  are  grateful 
to  S.  Ellner  for  pointing  this  out  to  us.)  This  may 
not  be  too  surprising  since  it  is  difficult  to  make  a 
linear  stochastic  process  with  a  long  coherence 
time.  Put  another  way,  the  time  series  should  not 
only  be  much  longer  than  the  dominant
periodicities  but  also  much  longer  than  the 
coherence  time  of  any  given  frequency  if  one  is 
to  try  and  model  it  with  a  linear  stochastic 
process. 

A  second  problem,  which  is  most  evident  for 
highly  sampled  continuous  data,  is  that  spurious 
high  frequencies  can  be  introduced.  This  can  be 
understood  as  an  artifact  of  the  Fourier  trans¬ 
form  which  assumes  the  time  series  is  periodic 
with  period  N.  This  means  that  there  is  a  jump- 
discontinuity  from  the  last  to  the  first  point.  We 
recommend tailoring the length N of the data set
so that x[0] ≈ x[N - 1]. This should not be a
problem if the time series is stationary and much
longer than its dominant period. We have
done  this  for  the  experimental  results  in  this 
article. 

2.4.2. Windowed Fourier transform (WFT)
algorithm

The problem of spurious high frequencies can
also be addressed by windowing the data before
taking the Fourier transform. The time series is
multiplied by a function w(t) = sin(πt/N) which
vanishes at the endpoints t = 0 and t = N. This
suppresses the jump discontinuity from the last
to the first point, and seems to effectively get rid
of the high frequency effect. However, it also
introduces a spurious low frequency from the
power spectrum of w(t) itself. We have done
experiments where we simply set the magnitude
of the offending frequency (f = 1/N) to zero;
this seems to work well for stationary time
series, but if there is significant power at that
frequency in the original data, it too will be
suppressed.
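
A windowed variant might look like the sketch below. The function name and the optional `zero_lowest` flag (which sets the f = 1/N component to zero, as in the experiments just mentioned) are hypothetical, and whether the mean should be removed before windowing is left open here.

```python
import numpy as np


def wft_surrogate(x, rng=None, zero_lowest=False):
    """Phase-randomized surrogate of the sine-windowed series w(t) * x[t]."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = len(x)
    window = np.sin(np.pi * np.arange(n) / n)  # w(t) = sin(pi t / N)
    spectrum = np.fft.rfft(window * x)
    if zero_lowest:
        spectrum[1] = 0.0  # remove the f = 1/N component entirely
    phases = rng.uniform(0.0, 2.0 * np.pi, size=len(spectrum))
    phases[0] = 0.0  # keep the zero-frequency term real
    if n % 2 == 0:
        phases[-1] = 0.0  # keep the f = 1/2 term real when N is even
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)
```

Note that the surrogate reproduces the spectrum of the windowed data, not of the raw data, so the discriminating statistic should be interpreted with that in mind.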

2.4.3. Amplitude adjusted Fourier transform
(AAFT) algorithm

The algorithm in this section generates surrogate
data sets associated with the null hypothesis
in section 2.2.4, that the observed time series is a
monotonic nonlinear transformation of a linear
gaussian process. The idea is to first rescale the
values in the original time series so they are
gaussian. Then the FT or WFT algorithm can be
used to make surrogate time series which have
the same Fourier spectrum as the rescaled data.
Finally, the gaussian surrogate is rescaled
back to have the same amplitude distribution as the
original time series.

Denote the original time series by x[t], with
t = 0, ..., N - 1. The first step is to make a
gaussian time series y[t], where each element is
generated independently from a gaussian pseudorandom
number generator. Next, we re-order
the time sequence of the gaussian time series so
that the ranks of both time series agree; that is,
if x[t] is the nth smallest of all the x's, then y[t]
will be the nth smallest of all the y's. Therefore,
the re-ordered y[t] is a time series which “follows”
the original time series x[t] and which has
a gaussian amplitude distribution. Using the FT
or WFT algorithm, a surrogate, call it y'[t], of
the gaussian time series can be created. If the
original time series x[t] is time re-ordered so that
it follows y'[t] in the sense that the ranks agree,
then the time-re-ordered time series provides a
surrogate of the original time series which
matches its amplitude distribution. Further, the
“underlying” time series (y[t] and y'[t]) are
gaussian and have the same Fourier power
spectrum.
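
The rank-ordering steps can be written compactly with sorting. The sketch below is an illustration under assumptions: the function names are hypothetical, the unwindowed phase randomization is used for the intermediate surrogate, and ties in the data are broken arbitrarily by the sort.

```python
import numpy as np


def phase_randomize(y, rng):
    """Unwindowed Fourier surrogate of y (same amplitudes, random phases)."""
    n = len(y)
    spectrum = np.fft.rfft(y)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=len(spectrum))
    phases[0] = 0.0
    if n % 2 == 0:
        phases[-1] = 0.0
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)


def aaft_surrogate(x, rng=None):
    """Amplitude adjusted surrogate: a re-ordering of the values of x."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x))  # rank of x[t] among all the x's
    gauss = np.sort(rng.standard_normal(len(x)))
    y = gauss[ranks]  # gaussian series that "follows" x
    y_surr = phase_randomize(y, rng)  # surrogate y'[t] of the gaussian series
    surr_ranks = np.argsort(np.argsort(y_surr))
    return np.sort(x)[surr_ranks]  # x re-ordered so that it follows y'[t]
```

Because the output is a permutation of the input values, any discretization in the original data is carried over to the surrogate exactly.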

3.  Experiments  with  numerical  time  series 

To  properly  gauge  the  utility  of  the  surrogate 
data  approach  will  eventually  require  many  tests 
with  data  from  both  numerical  and  laboratory 
experiments.  In  this  section  we  illustrate  several 
aspects  of  the  method  with  data  whose  underly¬ 
ing  dynamics  is  known.  In  the  next  section,  we 
consider  several  examples  with  real  data. 

3.1.  Linear  gaussian  data 


First,  we  note  that  a  time  series  which  actually 
is  generated  by  a  linear  process  should  by  con¬ 
struction  give  a  negative  result  (that  is,  the  null 
hypothesis  should  not  be  rejected).  We  checked 
this  by  generating  some  time  series  with  two 
simple linear processes, a moving average

x_t = e_t + a e_{t-1}    (10)

and an autoregressive

x_t = a x_{t-1} + e_t.    (11)

We used an embedding dimension m = 3 and
computed correlation dimension from N = 4096
points. The “correct” dimension for both processes
is ν = m = 3, though as fig. 2 shows, the
estimates were always biased low. The bias increases
for data which are more highly autocorrelated
(|a| larger) but the point we wish to make
is  that  the  bias  is  the  same  for  the  original  data 
and  for  the  surrogates.  The  null  hypothesis  is  not 
rejected. 
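
For readers who wish to repeat this check, the two test processes can be generated as in the sketch below. The unit-variance gaussian innovations, the burn-in length for the autoregressive process, and the function names are assumptions.

```python
import numpy as np


def moving_average(n, a, rng):
    """x[t] = e[t] + a * e[t-1] with unit-variance gaussian e."""
    e = rng.standard_normal(n + 1)
    return e[1:] + a * e[:-1]


def autoregressive(n, a, rng, burn_in=500):
    """x[t] = a * x[t-1] + e[t]; the first burn_in points are discarded."""
    e = rng.standard_normal(n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        x[t] = a * x[t - 1] + e[t]
    return x[burn_in:]
```

The autoregressive recursion is stable only for |a| < 1, so the coefficient should be kept inside that range.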


Fig. 2. Significance of evidence for nonlinearity for linear
gaussian time series generated by (a) a moving average
process, and (b) an autoregressive process. The coefficient in
each case is a. The estimated dimension is shown for five
realizations of the linear process (□) and thirty realizations
of surrogate data (+). Note that the dimension does not
distinguish the original from the surrogate data. The value we
obtain for significance is shown in the lower panels and in
neither case is significant.


3.2. Variation with number of data points and
complexity of attractor

Using the FT algorithm, we showed in ref. [19]
that increasing the number of points in a time
series increases the significance with which nonlinearity
can be detected in a time series that is
known to be chaotic; and increasing the complexity
of the chaotic time series decreases the
ability to distinguish from linearity. This basic
point is illustrated again here using the AAFT
algorithm (see fig. 3); while the significance is
not as large using this more general null hypothesis,
the qualitative behavior is the same. Time
series are generated by summing n independent
trajectories of the Henon map [54]; such time
series will have a dimension nν where ν ≈ 1.2 is
the dimension of a single Henon trajectory. For
the largest data sets, with N = 8192 points, our
dimension estimator obtained correlation dimensions
of 1.215 ± 0.008, 2.279 ± 0.014, 3.48 ±
0.02, and 4.81 ± 0.06 using embedding dimensions
m = 3, 4, 5, and 6, for n = 1, 2, 3, and 4,
respectively.

Fig. 3. Using the AAFT algorithm to generate surrogate
data, the significance as a function of the number N of data
points is computed for time series obtained by summing n
independent trajectories of the Henon map. For both (a)
correlation dimension and (b) forecasting error, the significance
increases with number of data points and decreases
with the complexity of the system.
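
A test series of this kind can be produced as in the following sketch. The standard Henon parameters a = 1.4 and b = 0.3, the random starting points near the origin, the transient length, and the function names are assumptions; the text above specifies only that n independent trajectories are summed.

```python
import numpy as np


def henon_series(n, rng, a=1.4, b=0.3, discard=1000):
    """x-coordinate of one Henon trajectory after discarding a transient."""
    x, y = rng.uniform(-0.1, 0.1, size=2)  # random start near the origin
    out = np.empty(n)
    for i in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if i >= discard:
            out[i - discard] = x
    return out


def summed_henon(n, n_traj, rng):
    """Sum of n_traj independently started Henon trajectories."""
    return sum(henon_series(n, rng) for _ in range(n_traj))
```

Increasing `n_traj` raises the dimension of the summed signal, which is the sense in which the "complexity" of the attractor is varied here.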

3.3.  Effect  of  observational  and  dynamical  noise 

To test whether nonlinear determinism can be
detected even when it is mixed with noise, we
added both dynamical (η) and observational (ε)
noise to the cosine map: y_t = λ cos(π y_{t-1}) + η_t;
x_t = y_t + ε_t. We chose a value λ = 2.8 which is in
the chaotic regime when the external noise is
zero. (The cosine map was used instead of the
Henon map because it does not “blow up” in the
presence of too much dynamical noise.) In fig. 4,
we plot significance as a function of noise level
for both dynamical and observational noise. As
expected, significance decreases with increasing
noise level, though we remark that the nonlinearity
is still observable even with considerable
noise. In the absence of noise, the rms
amplitude of the signal is 0.36; thus we are able
to detect significant nonlinearity even with a
signal to noise ratio of one, using a time series of
length N = 512. We also note that the decrease in
significance with increased dynamical noise is not
always monotonic; low levels of dynamical noise
can make the nonlinearity more evident.

Fig. 4. Effect of noise on significance, plotted against noise
amplitude, for a short time series of N = 512 points, derived
from the cosine map with λ = 2.8:
(a) observational noise; (b) dynamical noise.
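
The two kinds of noise enter the simulation at different places, as the sketch below makes explicit. It reads the map as y_t = λ cos(π y_{t-1}) + η_t with x_t = y_t + ε_t; that reading, the gaussian form of both noise terms, the transient length, and the function name are assumptions.

```python
import numpy as np


def noisy_cosine_map(n, lam=2.8, dyn_noise=0.0, obs_noise=0.0, rng=None, discard=100):
    """Cosine map with dynamical noise (fed back) and observational noise (added after)."""
    rng = np.random.default_rng() if rng is None else rng
    y = rng.uniform(-1.0, 1.0)
    ys = np.empty(n)
    for i in range(n + discard):
        # Dynamical noise perturbs the state and so influences all later values.
        y = lam * np.cos(np.pi * y) + dyn_noise * rng.standard_normal()
        if i >= discard:
            ys[i - discard] = y
    # Observational noise is added to the recorded values only.
    return ys + obs_noise * rng.standard_normal(n)
```

Because the cosine is bounded, the state cannot escape to infinity however large `dyn_noise` is, which is the property that motivated the choice of this map.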

3.4.  Continuous  data 


In  most  experiments,  data  is  better  described 
as  a  flow  than  a  map.  Although  there  is  a  formal 
equivalence,  data  which  arise  from  processes 
that  are  continuous  in  time  are  often  sampled  at 
a  much  faster  rate  than  is  characteristic  of  the 
underlying  dynamics.  For  these  data  sets,  the 
effects  of  autocorrelation  can  be  quite  large,  and
the  importance  of  testing  against  a  null  hypoth¬ 
esis  that  includes  autocorrelation  becomes 
paramount. 

We  illustrate  this  point  with  numerical  experi¬ 
ments  on  data  obtained  from  the  Mackey-Glass 
differential  delay  equation  [55] 


dx/dt = -b x(t) + a x(t - τ) / (1 + [x(t - τ)]^{10})    (12)

with a = 0.2, b = 0.1, and τ = 30. Grassberger
and  Procaccia  [39]  compute  a  correlation  dimen¬ 
sion  of  3.0  for  these  parameters. 
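
A simple way to generate such data is sketched below, assuming the standard form of the Mackey-Glass equation, dx/dt = -b x(t) + a x(t - τ) / (1 + x(t - τ)^10). The forward Euler scheme, the constant initial history, the transient length, and the function name are assumptions; a higher-order integrator would be preferable for quantitative work.

```python
import numpy as np


def mackey_glass(n, dt=0.1, a=0.2, b=0.1, tau=30.0, x0=1.2, discard=5000):
    """Euler integration of the Mackey-Glass delay equation, sampled every dt."""
    lag = int(round(tau / dt))  # the delay measured in time steps
    total = n + discard + lag
    x = np.full(total, x0)  # constant history on the initial delay interval
    for t in range(lag, total - 1):
        x_tau = x[t - lag]
        x[t + 1] = x[t] + dt * (-b * x[t] + a * x_tau / (1.0 + x_tau ** 10))
    return x[-n:]  # keep the last n points, after the transient
```

Taking every 30th point of the Δt = 0.1 output, for example `mackey_glass(30 * 4096)[::30]`, gives a series sampled at Δt = 3.0.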


Fig. 5. (a) Correlation integral C(N, r) for N = 4096 points
and embedding dimensions m = 3, ..., 19 from oversampled
Mackey-Glass data. (b) Estimated correlation dimension
according to Takens estimator as a function of cutoff r.


3.4.1.  A  poor  embedding 
We oversample the data (Δt = 0.1) and use a
deliberately poor embedding strategy: straight
time-delay coordinates with a lag time of one
sample period. We estimate correlation dimension
with N = 4096 points and compute distances
between all pairs of distinct vectors (despite the
advice in refs. [2, 56]). Fig. 5 shows the correlation
integral and estimated dimension as a function
of the upper cutoff value R. There is about a
decade of roughly constant slope, which might
be taken to indicate convergence to a low correlation
dimension.

For this example, the dimension statistic was
computed as the Takens best estimator [12] at an
upper cutoff of R = 0.02. (For comparison, the
RMS value for this time series is 0.25.)
Fig. 6a shows an apparent convergence of the
estimated dimension as a function of embedding
dimension. A naive interpretation of this figure
is that the time series arises from a low-dimensional
strange attractor. However, as fig. 6a
shows, the surrogate data also converge to a low
dimension; the convergence is evidently an artifact
of the autocorrelation. Indeed, fig. 6b shows
that the dimension statistic in this case does not
even provide evidence for nonlinearity.

Fig. 6. (a) Estimated correlation dimension versus embedding
dimension m for oversampled (Δt = 0.1) Mackey-Glass
data (□) and for surrogates generated using the WFT algorithm
(+). (b) Significance of nonlinearity in no case
exceeds three sigmas.

3.4.2. A better embedding

From the same Mackey-Glass process, we
recompute correlation dimension and the significance
of evidence for nonlinearity using a better
(though probably still not optimal) choice of
embedding. We sample at a much lower rate,
Δt = 3.0, and again use straight delay coordinates
with lag time of one sample period. We estimate
the correlation dimension as described in section
2.3.1 with N = 4096 points, and we avoid pairs of
points which are closer together in time than one
hundred sample periods. In fig. 7, we see that
the evidence for nonlinearity is extremely significant.
Indeed, we also see positive evidence of
low-dimensional behavior (the estimated dimension
ν converges with m) which we know is not
an artifact of autocorrelation.

Fig. 7. Same as previous figure, except that a better embedding
and a better algorithm were used for estimating the
dimension. Not only is the evidence for nonlinearity extremely
significant in this case, but it is also evident that the
process is low-dimensional.


4.  Examples  with  real  data 

We  report  some  results  on  experimental  time
series  from  several  sources.  These  results  should 
be  taken  as  illustrative,  and  not  necessarily  typi¬ 
cal  of  the  class  which  they  represent.  In  particu¬ 
lar,  we  have  not  yet  attempted  to  “normalize” 
our  findings  with  others  that  have  previously 
appeared  in  the  literature. 

4.1. Rayleigh-Benard convection

Data from a mixture of ^3He and superfluid
^4He in a Rayleigh-Benard convection cell [57]
provides  an  example  where  the  evidence  for 
nonlinear  structure  is  extremely  significant.  The 
significance  as  obtained  with  the  dimension  and 
forecasting  statistics  from  a  time  series  of  N  = 
2048  points  are  shown  in  fig.  8.  Further,  the 
dimension  statistic  indicates  that  the  flow  is  in 
fact  low-dimensional;  while  the  measured  dimen¬ 
sion  of  3.8  may  be  due  to  an  artifact  of  some 
kind,  we  are  at  least  assured  that  it  is  not  an 
artifact  of  autocorrelation  or  of  nongaussian  am¬ 
plitude  distribution.  Farmer  and  Sidorowich  [42] 
used this data to demonstrate the enhanced predictability
using nonlinear rather than linear predictors.

Fig. 8. Data from a fluid convection experiment exhibits very
significant nonlinear structure, using (a) dimension, and (b)
forecasting error. The top panel in each figure shows the
significance, measured in “sigmas”, and the bottom panel
shows the values of the statistics, with squares (□) for the
original data and pluses (+) for the AAFT-generated surrogates.
Both panels plot these statistics against the embedding
dimension m. Not only is the evidence for nonlinear structure
statistically significant, but the estimated dimension of about
ν = 3.8 suggests that the underlying dynamics is in fact
low-dimensional chaos.

4.2. The human electroencephalogram (EEG)

The  electroencephalogram  (EEG)  is  to  the 
brain  what  the  electrocardiogram  (EKG)  is  to 
the  heart.  It  has  become  a  widely  used  tool  for 
the  monitoring  of  electrical  brain  activity,  and  its 
potential  for  diagnosis  is  still  being  explored.  A 
number  of  researchers  have  applied  the  methods 
that  were  developed  for  the  analysis  of  chaotic 
time  series  to  EEG  time  series.  While  it  was 
hoped  that  the  characterization  of  deterministic 
structure  in  EEG  would  eventually  lead  to  in¬ 
sights  about  the  workings  of  the  brain,  the  shor¬ 
ter  term  goal  was  to  use  the  nonlinear  properties 
of  the  time  series  as  a  diagnostic  tool  [58,  59]. 

Although  we  feel  a  more  systematic  survey  is 
in  order,  we  have  not  examined  any  EEG  data 
which  gives  positive  evidence  of  low-dimensional 
chaos.  However,  we  have  found  examples  where 
nonlinear  structure  was  evident.  We  present  here 
two  cases,  one  positive  and  one  negative.  The 
two  time  series  are  from  the  same  individual, 
eyes  closed  and  resting;  one  is  from  a  probe  at 
the left occipital (O1), and the other from the
left  central  (C3)  part  of  the  skull.  The  sampling 
rate  is  150  Hz,  and  N  =  2048  time  samples  are 
taken.  The  two  time  series  are  not  necessarily 
contemporaneous.  Using  the  dimension  statistic, 
the  first  data  set  shows  no  significant  evidence 
for  nonlinearity,  but  the  second  data  set  exhibits 
about  eight  sigmas.  Even  in  the  significant  case, 
we  do  not  see  any  evidence  that  the  time  series  is 
in fact low-dimensional (the correlation dimension
ν does not converge with increasing embedding
dimension m). We are formally able to
reject  the  null  hypothesis  that  the  data  arise  from 
a  linear  stochastic  process,  but  by  comparing  the 
surrogate  data  to  the  real  data,  we  see  no  reason 
to  expect  that  the  “significant”  data  arises  from  a 
low-dimensional chaotic attractor.

4.3.  The  sunspot  cycle 

Our  final  example  is  the  well  known  and  much 
studied eleven year sunspot cycle [44, 60-66].


Fig. 9. Data from two electroencephalogram (EEG) time
series.  Using  the  dimension  statistic,  the  first  (a)  shows  no 
nonlinear  structure,  while  the  second  (b)  exhibits  significant 
nonlinear  structure  at  the  eight  sigma  level.  The  evidence  for 
low-dimensional  chaos,  however,  is  weak,  since  the  estimated 
dimension  increases  almost  as  rapidly  with  embedding  dimen¬ 
sion  for  the  original  time  series  as  it  does  for  the  surrogates. 

First,  we  used  the  FT  algorithm  for  generating 
surrogate  data,  but  we  were  careful  to  use  a 
length of time series (N = 287) for which the first
and  last  data  point  both  corresponded  to 
minima;  this  avoids  introducing  the  spurious  high 
Fig. 10. Significance of nonlinearity in the annual sunspot
series;  (a)  against  the  null  hypothesis  of  linear  gaussian  noise 
(surrogates  generated  by  FT  algorithm),  and  (b)  the  null 
hypothesis  of  amplitude  corrected  linear  gaussian  noise  (sur¬ 
rogates  generated  by  AAFT  algorithm).  For  both  plots,  the 
discriminating  statistics  are  estimated  dimension  (O),  log 
median  prediction  error  (□),  and  the  skew  statistic  described 
in  the  text  (O). 

frequencies  that  we  discussed  in  section  2.4.1.  As 
fig.  10a  shows,  it  is  possible  to  quite  confidently 
reject  the  null  hypothesis  of  linear  gaussian 
noise;  this  is  in  agreement  with  the  numerous 
authors  who  obtained  better  agreement  using 


nonlinear  models  instead  of  linear  models.  How¬ 
ever,  when  we  expand  the  null  hypothesis  to 
include  a  static  nonlinear  observation  of  an  un¬ 
derlying  linear  gaussian  process,  the  evidence  for 
dynamical  nonlinear  structure  is  less  dramatic. 
Using the dimension statistic, there is no significance;
the prediction statistic gives that the
evidence is just significant; the cubed difference
statistic

$$Q = \langle (x_{t+1} - x_t)^3 \rangle / \langle (x_{t+1} - x_t)^2 \rangle ,$$

which  is  a  measure  of  the  time  irreversibility  of 
the  data,  provides  a  more  significant  rejection  of 
the  hypothesis  of  static  nonlinear  filter  of  an 
underlying  linear  process. 
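
The cubed difference statistic can be evaluated in a few lines. The following is a minimal sketch, assuming the angle brackets denote averages over the time index; the function name and the sawtooth test signal are illustrative choices, not part of the original analysis. Reversing the series changes the sign of Q, which is why Q serves as a measure of time irreversibility.

```python
import numpy as np

def cubed_difference_statistic(x):
    """Q = <(x[t+1] - x[t])**3> / <(x[t+1] - x[t])**2>, averaged over t."""
    d = np.diff(np.asarray(x, dtype=float))
    return np.mean(d ** 3) / np.mean(d ** 2)

# Hypothetical test signal: a sawtooth that rises slowly and drops abruptly.
# Played backwards it rises abruptly and falls slowly, so Q changes sign.
sawtooth = np.tile(np.arange(10.0), 20)
q_forward = cubed_difference_statistic(sawtooth)
q_backward = cubed_difference_statistic(sawtooth[::-1])
```

For a time-reversible process the expected value of Q is zero, so its significance is judged, as for the other statistics, by comparing the value for the data with the spread of values over the surrogates.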

5.  Comparison  to  other  work 

Numerous  authors  have  carefully  compared 
their  dimension  estimates  for  real  data  against 
similar  estimates  for  white  noise.  A  few  have 
extended  this  informal  control  to  other  forms  of 
correlated  noise.  Grassberger  [2]  showed  that  a 
reported  dimension  for  climate  data  could  be 
reproduced with data from an Ornstein-Uhlenbeck
process. Osborne et al. [5] criticized the
Grassberger-Procaccia algorithm on the basis
that the low dimension it gave to nonstationary
data on ocean currents it also gave to data
generated by randomizing the phases of the
Fourier  transform.  Kaplan  and  Cohen  [32]  ar¬ 
gued  that  fibrillation  was  not  usefully  described 
as  chaotic,  again  since  randomly  phased  data 
gave  similar  dimensions.  Smith  [67]  has  used  the 
FT  algorithm  to  generate  surrogates  which  are 
used  to  assess  the  predictability  of  geophysical 
time series. Weiss [62] described a comparison
of  the  sunspot  time  series  against  a  particular 
stochastic  model.  Brock  et  al.  [52]  used  technical 
trading  rules  to  distinguish  stock  market  data 
from  surrogates  generated  by  several  stochastic 
models.  And  Ellner  [68]  showed  that  a  variety  of 
“plausible  alternatives”  might  adequately  ex¬ 
plain  measles  and  chickenpox  data,  despite  ear¬ 
lier  claims  of  chaos. 

Brock  and  coworkers  in  particular  [33,  52, 
69-71], and the economics community in general
[29, 34, 72, 73], have been extremely active in
the  development  of  statistical  tools  for  time 
series  analysis.  While  the  choice  of  null  hypo¬ 
theses  for  financial  time  series  tends  to  be  differ¬ 
ent  than  for  more  physical  time  series  (autocor¬ 
relation  plays  a  lesser  role,  for  example),  the 
overall  methodologies  are  quite  similar.  Classical 
statisticians  [20-25,  28]  have  long  considered 
tests  for  nonlinearity,  and  are  becoming  increas¬ 
ingly  aware  of  low-dimensional  chaos  (just  as 
physicists  are  becoming  increasingly  aware  of  the 
importance  of  the  statistical  approach);  we  cite 
Tong  [26]  as  the  review  which  most  neatly  and 
comprehensively  ties  these  two  fields  together. 

6. Conclusion

In  this  article,  we  have  described  an  approach 
for  evaluating  the  statistical  significance  of  evi¬ 
dence  for  nonlinearity  in  a  stationary  time  series. 
The  test  properly  fails  to  find  nonlinear  structure 
in  linear  stochastic  systems,  and  correctly  iden¬ 
tifies  nonlinearity  in  several  well-known  exam¬ 
ples  of  low-dimensional  chaotic  time  series,  even 
when  contaminated  with  dynamical  and  observa¬ 
tional  noise.  We  illustrated  the  method  with 
several  experimental  data  sets,  and  confirmed 
the  evidence  for  nonlinear  structure  in  some 
systems,  while  failing  to  see  such  structure  in 
other  time  series. 
Singular-spectrum analysis (SSA) is developed further, based on experience with applications to geophysical time
series.  It  is  shown  that  SSA  provides  a  crude  but  robust  approximation  of  strange  attractors  by  tori,  in  the  presence  of 
noise.  The  method  works  well  for  short,  noisy  time  series. 

The lagged-covariance matrix of the processes studied is the basis of SSA. We select subsets of eigenelements and
associated principal components (PCs) in order to provide (i) a noise-reduction algorithm, (ii) a detrending algorithm, and
(iii) an algorithm for the identification of oscillatory components. Reconstructed components (RCs) are developed to
provide  optimal  reconstruction  of  a  dynamic  process  at  precise  epochs,  rather  than  averaged  over  the  window  length  of  the 
analysis. 

SSA  is  combined  with  advanced  spectral-analysis  methods  -  the  maximum  entropy  method  (MEM)  and  the  multi-taper 
method  (MTM)  -  to  refine  the  interpretation  of  oscillatory  behavior.  A  combined  SSA-MEM  method  is  also  used  for  the 
prediction  of  selected  subsets  of  RCs. 

The  entire  toolkit  is  validated  against  a  set  of  four  prescribed  time  series  generated  by  known  processes,  quasi-periodic 
or chaotic. It is also applied to a time series of global surface air temperatures, 130 years long, which has attracted
considerable  attention  in  the  context  of  the  global  warming  issue  and  provides  a  severe  test  for  noise  reduction  and 
prediction. 


1.  Introduction 

1.1.  Motivation 

The  analysis  of  observed  time  series  is  often  a 
prerequisite  for  progress  in  modeling  and  fore¬ 
casting  the  physical  system  which  generates 
them.  Three  cases  have  to  be  distinguished. 
First,  when  the  evolution  equations  governing 
the  physical  system  are  already  known,  and  are 
relatively  insensitive  to  initial  data,  forecasting  is 
based on these equations and its accuracy depends
largely on the quality of initial data. Such
is  the  case  of  celestial  mechanics  [1],  on  the 
whole. 

For  other  systems,  the  knowledge  of  exact 
evolution  equations  is  useless  for  long-term  fore¬ 
casting  purposes,  even  when  good  initial  data  are 
available.  This  typically  happens  when  the 
dynamical  system  has  instabilities  and  non- 
linearities  that  give  rise  to  deterministic  chaos,  as 
shown by Lorenz [2]. Chaos, however, does not
mean  that,  for  large  time  scales,  the  behavior  is 
totally  irregular  or  random.  Some  macroscopic 
regularities,  such  as  near  periodicities,  may  still 
contribute  a  large  part  of  the  variability  of  the 
system.  This  is  the  second  possibility:  we  know 
the  equations,  detailed  forecasting  based  on 
them for a long time is impossible because of the
sensitivity  to  initial  data,  but  there  might  be 
other  ways  to  predict  the  regular  part  of  the 
asymptotic  behavior.  Indeed,  the  phase-space 
trajectories  of  such  a  chaotic  system  converge 
generically  to  a  strange  attractor.  Weakly  un¬ 
stable  periodic  orbits,  contained  in  the  latter,  can 
attract  trajectories  intermittently,  and  therefore 
lead  to  spells  of  periodic  activity.  The  underlying 
periodic  orbits  have  to  be  identified  by  data 
analysis,  and  can  help  extended-range  pre¬ 
diction. 

The  climatic  system  is  such  a  dynamical  sys¬ 
tem.  The  deterministic  predictability  limit  of  de¬ 
tailed  weather  is  not  longer  than  a  couple  of 
weeks [3; 4, pp. 182-190, 438-441]. On longer
time  scales,  instabilities  and  nonlinearities  make 
the  atmosphere  unpredictable.  However,  there 
are some near-periodicities such as the El Niño-
Southern Oscillation (ENSO) cycle in the atmosphere
and the oceans [5], with periods of two to
five  years,  or  the  40-50  day  oscillation  [6]  in  the 
tropical  atmosphere.  The  regularity  of  these  phe¬ 
nomena  can  make  them  easier  to  predict  with 
empirical  models  [7],  based  on  time-series  analy¬ 
sis,  than  with  elaborate  general  circulation  mod¬ 
els,  based  on  the  discretization  and  numerical 
solution  of  atmosphere-ocean-coupled  systems 
of  partial  differential  equations. 

The  third  class  includes  systems  with  unknown 
evolution  equations.  An  example  is  given  by 
complex  biomedical  systems,  such  as  the  human 
brain  [8].  Often,  only  noisy  measurements  of  one 
variable  from  an  intrinsically  high-dimensional 
system  are  available,  in  either  one  of  the  two 
latter  cases. 

The  purpose  of  this  paper  is  to  review  the 
capabilities  of  a  data-analysis  method,  called  sin¬ 
gular-spectrum  analysis  (SSA):  SSA  extracts  as 
much  reliable  information  as  possible  from  short 
and  noisy  time  series  without  using  prior  knowl¬ 
edge  about  the  underlying  physics  or  biology  of 
the  system;  based  on  this  information,  it  also 
provides prediction models. If only measurements
of one variable are available, single-channel
SSA applies. When several variables are
measured,  the  cross-correlations  between  the 
time  series  can  be  taken  into  account  by  using 
multi-channel  SSA. 

SSA  is  essentially  a  linear  analysis  and  predic¬ 
tion  method.  Its  superiority  over  classical  spec¬ 
tral  methods,  and  the  sense  in  which  it  can  use 
concepts  from  and  be  useful  in  nonlinear 
dynamics,  lies  in  the  data-adaptive  character  of 
the  eigenelements  it  is  based  on.  Truly  nonlinear 
information  about  and  high  predictive  skill  for 
intrinsically  low-dimensional  systems  requires 
tens  of  thousands  of  data  points  [9],  and  many 
more  for  typical  systems  with  intermediate  and 
high  phase-space  dimensions.  As  we  shall  see, 
SSA  can  provide  useful  physical  insight  and 
modest,  but  unprecedented,  medium-term  pre¬ 
dictive  skill  starting  with  the  few  hundred  data 
points  typically  available  for  geophysical  and 
other  natural  systems. 

1.2.  Background 

SSA  as  a  data-analysis  method  has  been  used 
for years in digital signal processing [10, 11]. It
was  introduced  into  oceanography  by  Colebrook 
[12],  and  into  nonlinear  dynamics  by  Broomhead 
and  King  [13]  and  by  Fraedrich  [14].  SSA  is 
based  on  principal  component  analysis  (PCA)  in 
the  vector  space  of  delay  coordinates  for  a  time 
series.  Classical  PCA  [15]  is  used  with  multi¬ 
channel  time  series,  and  gives  the  principal  axes 
of a sequence of M-dimensional vectors ($X_i$, $1 \le i \le N$),
by expanding it with respect to an orthonormal
basis ($E^k$, $1 \le k \le M$):

$$X_{i,j} = \sum_{k=1}^{M} a_i^k E_j^k , \quad 1 \le j \le M . \qquad (1.1)$$

The projection coefficients $a_i^k$ are called the principal
components (PCs), and the basis vectors $E^k$
the empirical orthogonal functions (EOFs). The
vectors $E^k$ are the eigenvectors of the cross-covariance
matrix of the sequence ($X_i$). For
single-channel SSA, if the scalar series values are
denoted by ($x_i$, $1 \le i \le N$), the equivalent expansion
is

$$x_{i+j} = \sum_{k=1}^{M} a_i^k E_j^k , \quad 1 \le j \le M . \qquad (1.2)$$

The analogy is made by augmenting the single
time series $x_i$ into the multi-variate time series
$X_i = (x_{i+1}, x_{i+2}, \ldots, x_{i+M})$. Aside from this
definition, there is no formal difference between
the  two  expansions  (1.1)  and  (1.2).  M  in  the 
latter  is  called  the  window  length,  or  embedding 
dimension  -  and  is  chosen  by  the  user  -  in  con¬ 
tradistinction  with  classical  PCA,  where  M  is  the 
fixed  dimension  of  the  data  vectors. 

The vectors $E^k$ are the eigenvectors of the
Toeplitz matrix of x, $T_x$, that contains in column
j and row i the covariance of x at lag i - j. In
both situations, the eigenvalue-eigenvector decomposition
of the covariance matrix (with respect
to space or lag) is related to singular-value
decomposition (SVD [16]) of a rectangular matrix;
in the case of SSA, the trajectory matrix has
the N - M augmented vectors $X_i$ as its columns.
SVD  is  a  class  of  algorithms  of  great  generality 
in  numerical  linear  algebra;  we  prefer  not  to 
confuse  matters  and  distinguish  between  it  and 
SSA,  which  is  a  methodology  for  time-series 
analysis. 
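
The relation between the trajectory matrix and the eigenvalue problem can be illustrated numerically. The sketch below is written under stated assumptions: the test series, the window length M = 10 and the function name are hypothetical, and the covariance matrix is formed by averaging over the columns of the trajectory matrix, so it is only approximately Toeplitz. For this matrix the eigenvalues equal the squared singular values of the trajectory matrix divided by the number of columns.

```python
import numpy as np

def trajectory_matrix(x, M):
    """Matrix whose N - M columns are the augmented vectors of length M."""
    x = np.asarray(x, dtype=float)
    n_cols = len(x) - M
    # Column i holds M consecutive values of the series (0-based indexing).
    return np.column_stack([x[i:i + M] for i in range(n_cols)])

# Hypothetical short, noisy test series, centered to zero mean.
rng = np.random.default_rng(0)
x = np.sin(0.3 * np.arange(200)) + 0.1 * rng.standard_normal(200)
x -= x.mean()

M = 10
D = trajectory_matrix(x, M)                 # shape (M, N - M)

# Lag-covariance matrix averaged over the columns of D.
C = D @ D.T / D.shape[1]
eigvals = np.linalg.eigvalsh(C)[::-1]       # sorted in descending order

# Singular values of D carry the same information: eigvals = s**2 / (N - M).
s = np.linalg.svd(D, compute_uv=False)
```

The SVD is thus one possible numerical route to the eigenelements, while SSA refers to the time-series methodology built on them.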

For multi-channel SSA [17-19], with original
L-dimensional data vectors $X_{l,i}$, $1 \le l \le L$,
$1 \le i \le N$, the expansion becomes

$$X_{l,i+j} = \sum_{k=1}^{L \times M} a_i^k E_{l,j}^k , \quad 1 \le l \le L , \; 1 \le j \le M . \qquad (1.3)$$

Here, the state vector considered at time i is

$$(X_{1,i+1}, X_{1,i+2}, \ldots, X_{1,i+M}, X_{2,i+1}, \ldots, X_{2,i+M}, \ldots, X_{L,i+1}, \ldots, X_{L,i+M}) .$$

M is still the window
length, but now the problem is of embedding
dimension L × M. The kth basis vector is the
eigenvector of the block-Toeplitz matrix T, containing
the cross-covariance coefficients of the
different channels l at lags 0 to M - 1.

The three formulae (1.1)-(1.3) are all applications
of the general Karhunen-Loève bi-orthogonal
expansion [20], and are most often used in
signal  processing  for  information  compression 
and  signal-to-noise  (S/N)  ratio  enhancement. 
Usually,  the  eigenvalues  of  the  symmetric, 
nonnegative  covariance  matrix  of  the  problem 
are sorted in descending order. The orthogonality
in both time (zero cross-covariance of two
different PCs at lag 0) and "space" (orthogonality
of the EOFs) implies in particular that $\lambda_k$ is the
variance of the kth PC. Therefore, truncating the
sum in eq. (1.1) at an order p < M reduces the
information  to  the  first  p  principal  components, 
instead  of  the  M  initial  components.  This  trunca¬ 
tion  is  done,  in  PCA,  in  an  optimal  way:  the  first 
p  principal  directions  describe  the  largest  frac¬ 
tion  of  the  total  variance  that  one  can  obtain 
using  a  projection  onto  p  orthogonal  vectors. 

Vautard  and  Ghil  [21]  (VG  hereafter)  showed 
that  the  SSA  expansion  (1.2)  yields  other  power¬ 
ful  tools  for  time-series  analysis  than  information 
compression  and  S/N  enhancement,  and  applied 
these  to  a  set  of  paleoclimatic  time  series.  In 
particular,  the  near-equality  of  a  pair  of 
eigenelements  is  associated  with  periodic  activity 
in  the  signal.  In  contradistinction  from  classical 
spectral  analysis,  where  the  basis  functions  are 
prescribed  sines  and  cosines,  SSA  can  easily  and 
automatically  localize  in  time  intermittent  oscil¬ 
lation  spells.  The  shape  of  these  oscillations  is 
determined  adaptively  from  the  data,  which 
makes  SSA  more  flexible  and  better  suited  for 
the  analysis  of  nonlinear,  anharmonic  oscilla¬ 
tions. 

SSA  also  provides  a  qualitative  decomposition 
of  the  signal  into  its  significant  and  noisy  parts. 
Indeed,  in  the  presence  of  white  noise,  the 
eigenvalue  spectrum  levels  off  after  a  certain 
order,  and  the  PCs  of  higher  order  are  domi¬ 
nated  by  noise.  VG  showed  that  the  order  S  of 
this  break  in  the  eigenvalue  spectrum  and  the 
capacity  dimension  D  of  the  underlying  dynami¬ 
cal  system  are  not  equal  to  each  other,  even 
approximately,  but  that  SSA  can  help  verify  a 
capacity estimate (see eqs. (4.2a)-(4.2d) of
VG). 

SSA  has  been  applied  by  now  to  over  a  dozen 
geophysical  data  sets,  on  time  scales  from  days 
to millennia, of various lengths and with spatial
extents  going  from  single  channel  to  hundreds  of 
grid  points.  Rasmusson  et  al.  [22]  showed  that 
the  irregular  ENSO  phenomenon  in  the  coupled 
ocean-atmosphere  system  contains  a  rather  reg¬ 
ular  quasi-biennial  signal  modulated  by  a  lower- 
frequency,  less  regular  4-5  year  oscillation.  Ghil 
and  Vautard  [23]  applied  SSA  to  a  135  year  long 
global-surface-temperature  time  series  and 
found  evidence  of  interannual  and  interdecadal 
oscillations,  confirmed  since  then  by  Allen  et  al. 
[24].  Ghil  and  Mo  [25]  gave  a  comprehensive 
description  of  intraseasonal  oscillations  in  the 
tropical  and  extratropical  atmosphere.  Multi¬ 
channel  SSA  applied  to  geopotential  height  data 
in  the  Northern  Hemisphere  extratropics  reveals 
cycles  with  periods  of  40-50  days  and  20-25  days 
[18, 19],  and  of  70  days  [19].  Penland  et  al.  [26] 
showed  that  SSA  prefiltering  allows  the  use  of 
the  maximum  entropy  method  (MEM)  with  low- 
order  autoregressive  (AR)  models  in  spectral 
estimation.  Based  on  this  combination  of  robust, 
low-order  AR  models  with  SSA,  Keppenne  and 
Ghil  [7]  predicted  the  Southern  Oscillation  index 
for  ENSO  with  considerable  skill  at  30  month 
lead  times. 

Each  one  of  these  applications  required  a  bet¬ 
ter  understanding  of  SSA’s  properties,  of  its 
strengths  and  weaknesses.  The  present  paper 
relies  heavily  on  the  experience  thus  accumu¬ 
lated,  and  goes  considerably  further  in  its  meth¬ 
odological  development  of  a  coherent  SSA 
toolkit. 

1.3.  Outline  of  the  present  study 

In  this  paper  we  concentrate  on  single-channel 
SSA.  The  studies  cited  in  the  previous  subsection 
raise  four  major,  and  several  minor  problems. 
First,  the  apparent  arbitrariness  in  the  choice  of 
window  length  M  gives  pause.  After  analyzing 


further  the  eigenvalue  problem  central  to  SSA  in 
section  2.1,  we  show  in  section  2.2  how  the 
choice of M can influence the analysis and the
results.  The  possibility  to  vary  M  makes  in  fact 
SSA  much  more  adaptable  to  a  large  range  of 
time  scales  than  other  statistical  tools  such  as 
complex  PCA  [27],  or  principal  oscillation  pat¬ 
terns  (POPs  [28]). 

Another  major  problem  is  that  of  robustness 
and  statistical  confidence.  Analytical  formulae 
for  confidence  levels  on  the  eigenvalues  [29] 
apply  only  to  sets  of  independent  realizations. 
We  examine  this  problem  in  section  2.3,  using 
the  methodology  described  in  section  1.4.  Sec¬ 
tions  2.4  and  2.5  are  devoted  to  the  interpreta¬ 
tion  of  the  time-dependent  by-products  of  SSA. 
In  particular,  we  derive  in  section  2.5  an  al¬ 
gorithm  capable  of  extracting  the  components  of 
the  signal  corresponding  to  individual  eigenele- 
ments,  at  a  given  epoch. 

Third,  the  identification  of  noise  plateaus  in 
the  eigenvalue  spectrum  was  quite  subjective  in 
VG  and  most  of  the  subsequent  work  in  atmos¬ 
pheric  and  climatic  applications.  Indeed,  ob¬ 
served  time  series  do  not,  in  general,  provide  the 
ideal  break  described  above,  nor  is  noise  ever 
perfectly white. In section 3, objective algorithms
for noise reduction are developed. The
analyzed  signal  is  decomposed  explicitly  as  the 
sum  of  an  intrinsic  dynamical  component  and  an 
external  noisy  component. 

Noise  reduction  is  a  well-known  signal¬ 
processing  problem.  Techniques  like  Kalman  fil¬ 
tering,  as  well  as  more  sophisticated  nonlinear 
dynamical  algorithms,  like  optimal  shadowing 
[30]  have  shown  very  satisfactory  results  in  noise 
reduction. Methods of this type require, however,
a knowledge of the evolution equations.
Approximations  of  the  equations  can  be  found 
empirically,  as  shown  by  Casdagli  [31],  but  this 
in  turn  requires  very  long  data  sets  in  order  to 
provide  a  useful  reconstruction.  Quite  to  the 
contrary, SSA is best at extracting - by essentially
linear but data-adaptive methods - useful information
about nonlinear systems from short,
noisy time series, in the absence of - or without
using  -  the  knowledge  about  governing  equa¬ 
tions. 

Finally,  the  interpretation  of  the  eigenele- 
ments  has  to  be  sharpened:  When  is  SSA  suc¬ 
cessful  at  removing  trends  and  nonstationarities? 
How  can  SSA  capture  nearly  periodic  behavior  - 
is  the  occurrence  of  a  pair  of  degenerate  eigen¬ 
values  always  associated  with  a  peak  in  the  spec¬ 
trum?  We  attempt  to  answer  these  two  questions 
in  sections  4.1  and  4.2,  respectively.  Maximum 
entropy  spectral  estimation  [32-34]  also  uses  the 
Toeplitz  matrix  in  order  to  determine  the 
associated  autoregression  (AR)  coefficients.  We 
propose  in  section  4.3  an  estimation  of  the  power 
spectrum  of  a  time  series  which  is  consistent  with 
both  MEM  and  SSA.  In  section  4.4,  SSA  is 
compared  with  multi-taper  spectral  estimation 
[35]. 

Linear  forecasting  algorithms,  based  on  SSA 
expansion  and  AR  models,  are  presented  in 
section  5,  following  Keppenne  and  Ghil  [7]. 
Concluding  remarks  appear  in  section  6. 

1.4.  Algorithm  validation 

In  order  to  validate  the  algorithms  we  develop 
here,  four  single-channel  processes  with  simple 
and  known  properties  are  analyzed  by  SSA.  For 
each  process,  100  Monte  Carlo  realizations  of 
150  points  are  generated.  These  sets  are  used  to 
provide  nonparametric  estimates  of  the  statistical 
confidence.  For  each  realization  of  each  process, 
SSA  is  applied  using  M  =  40  and  M  =  20,  in 
order  to  examine  the  influence  of  the  window 
length. The processes $x_n$, $1 \le n \le 4$, are given
below by defining $y_n$ and $w_n$ in

$$x_{n,i} = y_{n,i} + w_{n,i} . \qquad (1.4)$$

For the processes 1, 2 and 3 (P1, P2 and P3)

$$y_{n,i} = 2 \cos(\Omega_1 i + \phi_1) + \cos(\Omega_2 i + \phi_2) , \qquad (1.5)$$

and $w_{n,i}$ is a Gaussian white-noise process with
variance $\sigma^2$ and zero expectation. $\Omega_1$ and
$\Omega_2$ are fixed frequencies, rational multiples of $\pi$,
with a long common period (equal to 140), while $\phi_1$ and $\phi_2$ are
constant, random phases depending on the realization;
since the common period is of the same
order of magnitude as the interval of 150 time
units over which the processes P1-P3 are sampled,
the behavior of P1-P3 approximates that
of stochastically perturbed quasi-periodic signals,
where no common period is present. For P1, we
take $\sigma^2 = 2.5$, so that it has the same variance as
the "signal" $y_1$. For P2, a larger noise variance is
taken, $\sigma^2 = 4$. For P3, the noise variance is the
same as the variance of the secondary oscillation,
$\sigma^2 = 0.5$.

For the fourth process (P4), $y_{4,i}$ are the consecutive
values of the T-variable in an integration
of the Lorenz [2] equations, with a predictor-corrector
scheme using a time step of 0.02
and  sampling  rate  of  0.1  (one  value  sampled 
every  5  time  steps).  The  parameter  values  are 
those  of  Lorenz  [2].  The  variance  of  the  white- 
noise process is taken equal to that of P2,
$\sigma^2 = 4$. Fig. 1 shows the first realization of each
process.  The  dashed  curves  represent  the  first 
150  values  of  the  noisy  signal  x„  to  be  analyzed, 
and  the  dotted  ones  the  pure  signal  y„.  The  light 
solid  curves  are  the  processed  data  using  the 
noise-reduction  algorithm  described  in  section  4, 
where  this  figure  is  discussed  further. 
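
The Monte Carlo ensembles for P1-P3 can be generated along the following lines. This is a sketch only: the two frequencies used below (π/14 and 2π/5, i.e. periods of 28 and 5 time units) are illustrative assumptions chosen so that the common period is 140, and the function name and random-number seeds are likewise hypothetical. The noise variances are those quoted in the text.

```python
import numpy as np

# ASSUMPTION: illustrative frequencies with periods 28 and 5 (common period 140).
OMEGA1 = np.pi / 14.0
OMEGA2 = 2.0 * np.pi / 5.0
NOISE_VARIANCE = {"P1": 2.5, "P2": 4.0, "P3": 0.5}

def quasi_periodic_process(n_points, noise_variance, rng):
    """One realization of x = y + w, eqs. (1.4)-(1.5); returns (x, y)."""
    i = np.arange(1, n_points + 1)
    # Constant phases, drawn once per realization.
    phi1, phi2 = rng.uniform(0.0, 2.0 * np.pi, size=2)
    y = 2.0 * np.cos(OMEGA1 * i + phi1) + np.cos(OMEGA2 * i + phi2)
    w = rng.normal(0.0, np.sqrt(noise_variance), size=n_points)
    return y + w, y

rng = np.random.default_rng(1)
# 100 realizations of 150 points for each of the three processes.
ensembles = {
    name: np.array([quasi_periodic_process(150, var, rng)[0] for _ in range(100)])
    for name, var in NOISE_VARIANCE.items()
}
```

Averaged over one common period, the clean signal has variance 2 + 0.5 = 2.5, which is the noise variance chosen for P1.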

SSA  is  also  applied  to  the  IPCC  global  surface 
air  temperature  data  [36].  The  yearly  averaged 
values  are  shown  in  fig.  2.  The  data  studied  by 
Ghil  and  Vautard  [23]  were  slightly  different 
[37],  as  was  the  method.  The  IPCC  data  are 
quite  noisy,  but  show  a  marked  trend  as  well  as  a 
significant  year-to-year  variability  and  inter- 
decadal  oscillations.  This  data  set  is  studied  as  a 
worst-case  example:  the  time  series  is  short, 
noisy  and  nonstationary.  Fortunately,  the  data 
are at least regularly sampled! As we do not
have independent realizations at our disposal, the stability
of the results is tested by repeating the analysis
with data from 1861-1950, 1861-1951, and so
on, up to 1861-1990.
Fig. 1. One realization of the four synthetic processes under
study, with the clean signal (dotted), the noisy time series
(dashed), and the noisy series after application of the SSA
noise-reduction filter (solid); M = 40. (a) P1, (b) P2, (c) P3
and (d) P4; note the scale difference on the ordinate between
panels.


2.  Theoretical  and  computational  preliminaries 

2.1.  The  Toeplitz  matrix 

The cornerstone of SSA is the Karhunen-
Loève expansion theorem; this in turn is based
on the lagged-covariance matrix of the process x,
whose sample is the time series ($x_i$, $1 \le i \le N$),
assumed - without loss of generality - to have
zero expectation [38]. This matrix has a Toeplitz
structure, i.e., constant diagonals corresponding
to equal lags:

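$$T_x = \begin{pmatrix} c(0) & c(1) & \cdots & c(M-1) \\ c(1) & c(0) & c(1) & \vdots \\ \vdots & c(1) & \ddots & c(1) \\ c(M-1) & \cdots & c(1) & c(0) \end{pmatrix} , \qquad (2.1)$$

where c(j), $0 \le j \le M - 1$, is the covariance of x
at lag j.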

There are different ways to estimate $T_x$
[29, 39, 40]. Among the most frequently used are
the Yule-Walker estimate

$$c(j) = \frac{1}{N} \sum_{i=1}^{N-j} x_i x_{i+j} , \qquad (2.2)$$


Fig. 2. The IPCC global temperature series [36] of yearly averages, as a function of time. The grand mean has been removed,
i.e.,  the  series  is  centered. 


R. Vautard et al. / SSA: A toolkit for noisy chaotic signals



as well as the estimate

$$c(j) = \frac{1}{N-j} \sum_{i=1}^{N-j} x_i x_{i+j} . \qquad (2.3)$$

Burg's [32] algorithm, which gives the AR coefficients
associated with MEM, also estimates
implicitly c(j); this estimate is again an average,
as in (2.2, 2.3), but with larger weights toward
the middle of the series than at the ends (see also
refs. [26, 41]).

The  first  estimate,  eq.  (2.2),  used  by  Box  and 
Jenkins  [39],  is  strongly  biased  when  the  number 
of  data  N  is  small;  the  estimate  in  (2.3)  has 
larger  variance  but  less  bias  [29].  When  N  is 
small,  the  Burg  estimate  can  also  be  biased  at 
large  lags  j,  if  there  is  power  at  periods  larger 
than  N,  as  shown  by  the  following  example: 
consider the process $x_i = \cos(\frac{1}{75} \pi i + \phi)$, where
$\phi$ is a random constant phase, and N = 100. Fig.
3  shows  the  average  of  the  various  estimates.  In 
this  case,  (2.3)  is  slightly  biased,  whereas  both 
(2.2)  and  the  Burg  estimates  are  strongly  biased. 
This  problem  occurs  only  when  there  are  very 
low  frequencies  in  the  system,  or  trends.  Once 
those frequencies are removed, the Burg
estimate and (2.3) are equivalent.

One  could  also  estimate  the  covariance  matrix 
of  the  process  as  in  classical  PCA,  by  averaging 
over  lagged  copies  of  the  window  [42].  This  has  a 
double  disadvantage:  (i)  it  does  not  conserve  the 


Fig. 3. Autocovariance function of the process $x_i = \cos(\frac{1}{75} \pi i)$:
true (heavy solid curve) versus three different
estimates  from  a  100  point  long  time  series,  as  described  in 
the  text  (see  inset  for  legend). 


Toeplitz  properties  of  the  sample  covariance  (see 
the  next  subsection),  and  (ii)  it  tends  to  give,  like 
the  Burg  estimate,  larger  weights  to  the  middle 
of  the  series  and  thus  produce  large  biases  as 
well.  In  the  absence  of  prior  information  about 
the  signal,  we  prefer  therefore  the  estimate 
(2.3),  and  use  it  hereafter  to  compute  the  Toep¬ 
litz  matrix. 
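
The two direct estimates and the resulting Toeplitz matrix can be sketched as follows. The function and argument names are illustrative assumptions, as is the slow cosine used as input; the series is taken to have zero expectation, as in the text, so no sample mean is removed. The Burg estimate is not reproduced here.

```python
import numpy as np

def lag_covariances(x, M, divide_by_overlap=True):
    """Covariance estimates c(0), ..., c(M-1) of a zero-expectation series.

    divide_by_overlap=False: normalization 1/N, as in eq. (2.2).
    divide_by_overlap=True:  normalization 1/(N - j), as in eq. (2.3).
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    c = np.empty(M)
    for j in range(M):
        lagged_sum = np.dot(x[:N - j], x[j:])   # sum of x_i * x_{i+j}
        c[j] = lagged_sum / (N - j) if divide_by_overlap else lagged_sum / N
    return c

def toeplitz_matrix(c):
    """Symmetric Toeplitz matrix with entry (i, j) equal to c(|i - j|)."""
    idx = np.arange(len(c))
    return c[np.abs(np.subtract.outer(idx, idx))]

# Hypothetical input: a cosine whose period (150) exceeds the length N = 100.
N, M = 100, 40
i = np.arange(1, N + 1)
x = np.cos(np.pi * i / 75.0 + 0.3)

c_22 = lag_covariances(x, M, divide_by_overlap=False)
c_23 = lag_covariances(x, M, divide_by_overlap=True)
T_x = toeplitz_matrix(c_23)
```

The two estimates differ only by the factor (N - j)/N, which is why the shrinkage of (2.2) toward zero grows with the lag when N is small.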

2.2.  Eigenelements  and  choice  of  window 
length 

The Toeplitz matrix $T_x$ is symmetric and nonnegative.
Its eigenvalues $\lambda_k$ are strictly positive,
except when data are perfectly clean and come
from a dynamical system with purely quasi-periodic
behavior; in the latter case, all but a
finite number - equal to twice the number of
frequencies - are zero. They are sorted into decreasing
order, and the eigenvectors $E^k$ are normalized
so that

$$\sum_{k=1}^{M} E_j^k E_l^k = \delta_{jl} , \quad 1 \le j \le M , \; 1 \le l \le M , \qquad (2.4)$$

and that the spectral decomposition formula
should hold:

$$\sum_{k=1}^{M} \lambda_k E_j^k E_l^k = (T_x)_{jl} = c(j - l) , \quad 1 \le j \le M , \; 1 \le l \le M . \qquad (2.5)$$
An eigenvector here is a lag sequence of length $M$. Since $T_x$ is a Toeplitz matrix, eigenvectors
are either symmetric or antisymmetric with respect to $\tfrac{1}{2}M$. When the sampling rate of the
signal is increased, for a given window length in time units, the shape of the first eigenvectors
does not change much, as shown by VG; thus, the eigenvectors have a limit as the sampling
interval goes to zero. On the other hand, if the embedding dimension is constant and the
sampling interval goes to zero, i.e., the window length also goes to zero, the eigenvectors tend to
fixed linear combinations of the $M$ successive time derivative operators [43].

For fixed, finite window length, Fortus [44] has shown that the largest eigenvalue corresponds to
the maximum of the spectrum, under certain restrictive assumptions. In the asymptotic limit
$M \to +\infty$, with fixed sampling interval, i.e., the window length goes to infinity, the eigenvectors
tend to pairs of sines and cosines and the associated eigenvalues tend to the corresponding
spectral density values [45, 21]. For finite $M$, all eigenvalues fall between the maximum and the
minimum of the spectral density [46].
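Two of these properties can be checked numerically: the symmetry or antisymmetry of Toeplitz eigenvectors, and the bounds on the eigenvalues. The sketch below uses a hypothetical test case, not one of the paper's processes: an autocovariance $c(j) = \rho^{|j|}$, whose spectral density has the closed-form extrema $(1+\rho)/(1-\rho)$ and $(1-\rho)/(1+\rho)$.

```python
import numpy as np

M, rho = 20, 0.8
lags = np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])
T = rho ** lags                          # Toeplitz matrix with c(j) = rho**|j|

lam, E = np.linalg.eigh(T)               # ascending eigenvalues, orthonormal columns
lam, E = lam[::-1], E[:, ::-1]           # sort into decreasing order

# Parity: the dot product of an eigenvector with its reversed copy is
# +1 (symmetric) or -1 (antisymmetric) when the eigenvector has unit norm.
parity = np.array([np.dot(E[:, k], E[::-1, k]) for k in range(M)])

# Extrema of the spectral density sum_j c(j) exp(-2*pi*1j*f*j) for this c(j)
s_max = (1 + rho) / (1 - rho)
s_min = (1 - rho) / (1 + rho)
```

For this example every entry of `parity` is $\pm 1$ up to rounding, and all eigenvalues lie strictly between `s_min` and `s_max`.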

A key problem in SSA is the proper choice of $M$. As we shall see forthwith, SSA does not
resolve periods longer than the window length. Hence, if one wishes to reconstruct a strange
attractor, whose spectrum includes periods of arbitrary length, the larger $M$ the better, as long
as statistical errors do not dominate the last values of the autocovariance function. To
prevent this, one should not exceed $M = \tfrac{1}{3}N$.

In many physical and engineering applications, however, one wishes to concentrate on
oscillatory phenomena in a certain band, which may be associated with the least-unstable periodic orbits
embedded in a strange attractor. Such periodic orbits typically generate oscillations of strongly
varying amplitude [18, 19, 23, 25]: the system's trajectory approaches and follows them for a
certain time - comparable to or longer than the period in question - only to wander off into
other parts of phase space. When the ratio of $M$ to the life time of such an intermittent
oscillation - the typical time interval of sustained high amplitudes - is large, the corresponding
eigenvector pair suffers from the same Gibbs effect as in classical spectral analysis (see section 2.4
below). Spells of the oscillation - weak or strong - will be smoothed out. The following
arguments should help to understand the difficulty and to make the correct choice of $M$.

Let us denote, by analogy with the time-continuous case treated in section 2 of VG, by
$\tilde{E}^k(f)$ the reduced Fourier transform of $E^k$, i.e.,

$$\tilde{E}^k(f) = \sum_{j=1}^{M} E^k_j \exp(2\pi i j f). \qquad (2.6)$$

$\tilde{E}^k(f)$ is also the response function of the filter transforming $x$ into its $k$th PC.
$\tilde{E}^k(f)$ is a sum of periodic functions of the frequency $f$ with periods
$1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots, 1/M$. Therefore the spectral resolution is $1/M$.
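The reduced Fourier transform (2.6) is straightforward to evaluate on a frequency grid. The sketch below is a hypothetical illustration: the Toeplitz matrix is built directly from an assumed autocovariance (a period-10 cosine plus a small white-noise term), and the function name and frequency grid are choices made for this example only.

```python
import numpy as np

def reduced_fourier_transform(E, freqs):
    """E_tilde[f, k] = sum_{j=1..M} E[j, k] * exp(2*pi*1j*j*f), following eq. (2.6)."""
    M = E.shape[0]
    j = np.arange(1, M + 1)
    phase = np.exp(2j * np.pi * np.outer(freqs, j))      # shape (n_freq, M)
    return phase @ E

M = 40
lags = np.arange(M)[:, None] - np.arange(M)[None, :]
T = np.cos(2 * np.pi * lags / 10.0) + 0.1 * np.eye(M)   # assumed autocovariance
lam, E = np.linalg.eigh(T)
lam, E = lam[::-1], E[:, ::-1]                           # decreasing order

freqs = np.linspace(0.0, 0.5, 501)
gain = np.abs(reduced_fourier_transform(E, freqs)) ** 2  # squared response, (n_freq, M)
f_peak = freqs[np.argmax(gain[:, 0])]                    # passband of the leading EOF
```

The leading EOF acts as a band-pass filter centered near the frequency 0.1 of the oscillation, located to within the resolution $1/M$. Because the EOFs are orthonormal, the squared responses of all $M$ filters add up to $M$ at every frequency.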

If $M$ is too small, the coarse resolution may cause several neighboring peaks in the spectrum
of $x$ to coalesce. When there is an intermittent oscillation, reflected by a broad spectral peak, on
the contrary, large $M$ values (high resolution) will split the peak into several components with
neighboring frequencies. Eq. (2.6) also shows that the filters are unable to isolate peaks at
frequencies lower than the resolution $1/M$, i.e., periods larger than $M$. Given a peak in the
power spectrum $P_x(f)$ of $x$ - with maximal spectral density at $f_0$ and width $2\,\delta f$, say - these
considerations suggest that SSA will isolate correctly the intermittent oscillation if

$$\frac{1}{f_0} \le M \le \frac{1}{2\,\delta f}. \qquad (2.7)$$

In other words, the window length has to be chosen between the period of the oscillation and
the average life time of its spells. In practice, this latter quantity cannot be estimated a priori, but
SSA is typically successful at analyzing periods in the range $(\tfrac{1}{5}M, M)$.

2.3.  Statistical  stability  of  eigenvalues 

Statistical stability is crucial in spectral analysis. SSA is based on estimates of lagged
autocovariances, i.e., on second-order moments. The eigenvalues should therefore converge at
least as well as Blackman-Tukey spectral estimates, heuristically speaking. Estimating the
statistical confidence to be placed in the eigenvectors, however, is rather difficult. Simple
confidence-interval formulae were used by Fraedrich [14], VG, and Ghil and Mo [25], who all assume
the independence of successive points of the augmented time series, either one sampling time
or one window length apart. Here we use a nonparametric Monte Carlo method to check for
stability, as we have several realizations of the synthetic processes P1-P4 at our disposal.

Fig. 4 shows the average of the eigenvalues $\lambda_k$, calculated with the 100 randomly generated data
sets, for the four synthetic processes, with $M = 40$ (solid circles) and $M = 20$ (open circles). The
95% error bars are calculated as $\pm 1.96\,\sigma_k$, where $\sigma_k^2$ is the variance - estimated with the 100
realizations - of $\lambda_k$. Also shown are the confidence intervals estimated with the heuristic
variance formula of Ghil and Mo [25], $\ldots = M/2N$, for P1 only; the latter is quite
conservative, except for the smallest $\lambda_k$'s.

For P1 (fig. 4a), the two spectra are quite similar, with two leading pairs standing out, and
a regular weakly descending ramp after $k = 5$. These two pairs are associated with the periodic
components of the signal, whereas the slowly decreasing part corresponds to the
white-noise-dominated components. Note that the tails of the spectra are not flat. This is essentially
due to the finiteness of the data.

Processes P2 and P3 (figs. 4b, 4c) behave in the same way as P1, with different noise
plateaus. The second pair of P2 stands out less from the rest of the spectrum, since it is reached
by the noise. We anticipate that for P2 SSA would not, based on most realizations, identify
the secondary oscillation of period 7 with great confidence. For P3, the level of noise is low, and
the last average eigenvalue is even negative (not shown on this logarithmic plot). Indeed, the
covariance estimate (2.3) may lead to a nonpositive Toeplitz matrix, but negative eigenvalues
are quite rare, and small in absolute value. For the stochastically perturbed Lorenz process (P4:
fig. 4d), no clear break is apparent, although an inflection point of the average spectrum lies at
$k \approx \tfrac{1}{2}M$.

Fig. 4. SSA eigenvalue spectra, for $M = 40$ (averages shown as solid circles), and $M = 20$ (open circles), with the 95% confidence
limits calculated using the actual variance of the 100 realizations (thick bars: $M = 40$; thin bars: $M = 20$). (a) P1, (b) P2, (c) P3 and
(d) P4. At the top of panel (a) are represented also the error bars (thin long I's) estimated by the heuristic variance formula of
Ghil and Mo [25]; note that the latter is quite conservative.

The near-disappearance of the second pair into the noise level for P2 and $M = 20$ raises
another point to consider in choosing the window length: if the process is a pure sine function
of variance $v$, the two nonvanishing eigenvalues are close to $v' = \tfrac{1}{2}Mv$ (compare eqs. (2.7a)-
(2.7e) of VG). On the other hand, if the process is noisy, the noise floor always lies at the value of
the noise variance, for all values of $M$ (with different slopes). Therefore, some pairs can be
extracted from the noise level by increasing $M$. The fact that the noisy part of the eigenvalue
spectrum is flatter for larger $M$ values enhances pair detection further as $M$ is increased.
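The scaling of the signal eigenvalues with $M$ can be checked directly. The sketch below builds the Toeplitz matrix from the exact autocovariance $c(j) = v\cos(2\pi j/T)$ of a pure sine of variance $v$; the period of 7 and the variance of 2 are arbitrary illustrative values, and the function name is hypothetical.

```python
import numpy as np

def sine_eigenvalues(M, period, v=1.0):
    """Eigenvalues (decreasing) of the M x M Toeplitz matrix of a pure sine of
    variance v, whose autocovariance is c(j) = v * cos(2*pi*j/period)."""
    lags = np.arange(M)[:, None] - np.arange(M)[None, :]
    T = v * np.cos(2 * np.pi * lags / period)
    return np.linalg.eigvalsh(T)[::-1]

for M in (20, 40):
    lam = sine_eigenvalues(M, period=7.0, v=2.0)
    print(M, lam[:2], "target:", 0.5 * M * 2.0)
```

Only two eigenvalues differ from zero, and both stay within a few percent of $\tfrac{1}{2}Mv$, so doubling $M$ roughly doubles the height of the pair above a noise floor that stays at the noise variance.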

For the IPCC data, the eigenvalue spectra (not shown) present no evidence of a noise floor, and
are rather similar to the spectrum [23] of the Jones et al. [37] data. Since the length of the
record is small (130 numbers), however, white-noise floors may be steep and hardly
distinguishable from signal. The quantitative method developed in section 3 shows that, in fact, about 20
eigenvalues are above the noise floor. The first two eigenvalues, representing the nonstationarity
of the data, are about one order of magnitude above the other ones.

When SSA is applied to the IPCC data, with the ending year varying continuously, the order
of the eigenvalues is not stable, whereas the shape of the eigenvectors is, i.e., it is possible to
follow the eigenvectors continuously, from ending year to ending year, but the associated
eigenvalues may undergo exchanges in their respective order (see also refs. [24, 42]).

2.4.  Principal  components  (PCs) 

The $k$th PC is defined as the orthogonal projection coefficient of the original series onto the
$k$th EOF:

$$a^k_i = \sum_{j=1}^{M} x_{i+j} E^k_j, \qquad 0 \le i \le N - M. \qquad (2.8)$$

PCs are therefore processes of length $N - M + 1$, which can be considered as weighted moving
averages of the process $x$. If we denote by $B$ the backward shift operator, and by $\Psi_k$ the
polynomial

$$\Psi_k(\xi) = E^k_M + E^k_{M-1}\,\xi + \cdots + E^k_1\,\xi^{M-1}, \qquad (2.9)$$

then the $k$th PC can be written as

$$a^k_i = \Psi_k(B)\,x_{i+M}. \qquad (2.10)$$

As in classical PCA, principal components are orthogonal to each other, i.e.,
$\mathcal{E}(a^k a^l) = \lambda_k \delta_{kl}$, where $\mathcal{E}$ is the expectation operator. This
does not mean that the PCs are independent from each other, since this relation holds for
SSA only at lag zero. PCs give the representation of the augmented time series in a new
coordinate system, with most information represented along the first coordinates.

The PCs can be interpreted in another way. Let us consider the portion of signal $x$ contained
between instants $i + 1$ and $i + M$ and, for a given $k$, the function

$$J(a) = \sum_{j=1}^{M} \left( x_{i+j} - a E^k_j \right)^2. \qquad (2.11)$$

It is easy to show that $J(a)$ is minimum for $a = a^k_i$, and hence that the PCs can be obtained
from the local fit, in the least-squares sense, of the $k$th EOF to the original series $x$. This
property is conserved if one fits several EOFs, i.e., the PCs are the coefficients of the linear combination
of any subset of EOFs that minimizes the square distance to the series $x$ over the window
considered. The implication is, again, that if an intermittent oscillation has a typical life time shorter
than the window length, Gibbs effects will reduce the amplitude of the fit within the spells
and produce artificial periodicity off the spells.
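Both readings of the PCs, as projections (2.8) and as local least-squares fits (2.11), can be compared numerically. The sketch below is illustrative only: the test series, the window length, and the divisor $N - j$ in the covariance estimate are assumptions, and `principal_components` is a hypothetical helper name.

```python
import numpy as np

def principal_components(x, E):
    """PCs A[i, k] = sum_j x[i + j] * E[j, k] for i = 0..N-M, i.e. eq. (2.8)
    written with 0-based indices."""
    x = np.asarray(x, dtype=float)
    M = E.shape[0]
    X = np.lib.stride_tricks.sliding_window_view(x, M)   # row i is x[i : i + M]
    return X @ E

rng = np.random.default_rng(1)
N, M = 300, 12
x = np.sin(2 * np.pi * np.arange(N) / 9) + 0.3 * rng.standard_normal(N)
x -= x.mean()

# EOFs: eigenvectors of a Toeplitz lag-covariance matrix (divisor N - j assumed)
c = np.array([np.dot(x[:N - j], x[j:]) / (N - j) for j in range(M)])
T = c[np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])]
E = np.linalg.eigh(T)[1][:, ::-1]                        # leading EOF first

A = principal_components(x, E)                           # shape (N - M + 1, M)

# Least-squares fits over one window: a single EOF, then the first three EOFs
i, k = 5, 0
a_fit = np.linalg.lstsq(E[:, [k]], x[i:i + M], rcond=None)[0][0]
a_fit3 = np.linalg.lstsq(E[:, :3], x[i:i + M], rcond=None)[0]
```

Because the EOFs are orthonormal, the fitted coefficients coincide with the corresponding PCs whether one EOF or several are fitted at once.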

From the spectral point of view, EOFs correspond to data-adaptive moving-average filters.
The power spectrum $P_k(f)$ of $a^k$, at frequency $f$, is

$$P_k(f) = P_x(f)\,|\tilde{E}^k(f)|^2, \qquad (2.12)$$

with $\tilde{E}^k$ being given by (2.6), or

$$P_k(f) = P_x(f)\,|\Psi_k(\xi)|^2 \qquad (2.13)$$

for $\xi = \exp(2\pi i f)$. The orthogonality constraints of the problem give the identity

$$1 = \frac{1}{M} \sum_{k=1}^{M} |\tilde{E}^k(f)|^2 \qquad (2.14)$$

for any frequency $f$, so that the sum of the spectra of the PCs is identical to the power
spectrum of $x$, i.e.,

$$P_x(f) = \frac{1}{M} \sum_{k=1}^{M} P_k(f). \qquad (2.15)$$

It is thus interesting to build stack spectra by piling up the contributions of the various
components (see section 4.3).

2.5. Reconstructed components (RCs)

PCs are filtered versions of the original series. However, they do not allow a unique expansion
of the signal into a sum of the different components. Indeed, in the expansion (1.2), individual
terms depend on the index $j$, varying from 1 to $M$. Therefore, there are $M$ different ways of
reconstructing the components of the signal, which do not give, in general, the same results.
Another problem in using eq. (1.2) for filtering, and in particular for real-time filtering and
prediction, is that the resulting series are of length $N - M + 1$, and not of length $N$ as desired. We
show here how to extract, in an optimal way, series of length $N$ - corresponding to a given
set of eigenelements - that we shall call reconstructed components (RCs).

Let us consider a subset $\mathcal{A}$ of eigenelements $k$ over which the reconstruction is to be
performed. By analogy with eq. (1.2), we seek a series of length $N$, $y = R_{\mathcal{A}}x$, such that the
quantity

$$H_{\mathcal{A}}(y) = \sum_{i=0}^{N-M} \sum_{j=1}^{M} \Big( y_{i+j} - \sum_{k \in \mathcal{A}} a^k_i E^k_j \Big)^2 \qquad (2.16)$$

is minimized. In other words, the optimal series $y$ is the one whose augmented version $Y$ is the
closest, in the least-squares sense, to the projection of the augmented series $X$ onto EOFs with
indices belonging to $\mathcal{A}$. The solution $y = R_{\mathcal{A}}x$ to this least-squares problem is given by

$$(R_{\mathcal{A}}x)_i = \frac{1}{M} \sum_{j=1}^{M} \sum_{k \in \mathcal{A}} a^k_{i-j} E^k_j \qquad \text{for } M \le i \le N - M + 1, \qquad (2.17a)$$

$$(R_{\mathcal{A}}x)_i = \frac{1}{i} \sum_{j=1}^{i} \sum_{k \in \mathcal{A}} a^k_{i-j} E^k_j \qquad \text{for } 1 \le i \le M - 1, \qquad (2.17b)$$

$$(R_{\mathcal{A}}x)_i = \frac{1}{N - i + 1} \sum_{j=i-N+M}^{M} \sum_{k \in \mathcal{A}} a^k_{i-j} E^k_j \qquad \text{for } N - M + 2 \le i \le N. \qquad (2.17c)$$


When $\mathcal{A}$ consists of a single index $k$, the series $R_{\mathcal{A}}x$ is called the $k$th RC, and will be denoted by
$x^k$. RCs have additive properties, i.e.,

$$R_{\mathcal{A}}x = \sum_{k \in \mathcal{A}} x^k. \qquad (2.18)$$

In particular, the series $x$ can be expanded as the sum of its RCs:

$$x = \sum_{k=1}^{M} x^k. \qquad (2.19)$$

Note that, despite its linear aspect, the transform changing the series $x$ into $x^k$ is, in fact,
nonlinear, since the eigenvectors $E^k$ depend nonlinearly on $x$. A drawback of RCs is that they are
correlated even at lag 0.
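The reconstruction of eqs. (2.17a)-(2.17c) amounts to projecting every lagged window onto the chosen EOFs and averaging all projected values that refer to the same time. A minimal sketch under that reading follows; the function name, the test series, and the covariance divisor $N - j$ are assumptions made for illustration.

```python
import numpy as np

def reconstructed_component(x, E, subset):
    """Length-N series of eqs. (2.17a)-(2.17c) for the EOF indices in `subset`
    (0-based columns of E)."""
    x = np.asarray(x, dtype=float)
    N, M = x.size, E.shape[0]
    X = np.lib.stride_tricks.sliding_window_view(x, M)   # (N - M + 1, M) windows
    Ek = E[:, list(subset)]
    P = (X @ Ek) @ Ek.T              # P[i, j] = sum_k a^k_i E^k_j for each window i
    y = np.zeros(N)
    count = np.zeros(N)
    for i in range(N - M + 1):       # average the entries that share the same time
        y[i:i + M] += P[i]
        count[i:i + M] += 1          # count is i, M or N - i + 1 as in (2.17)
    return y / count

rng = np.random.default_rng(2)
N, M = 120, 10
x = np.sin(2 * np.pi * np.arange(N) / 8) + 0.4 * rng.standard_normal(N)
x -= x.mean()
c = np.array([np.dot(x[:N - j], x[j:]) / (N - j) for j in range(M)])
T = c[np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])]
E = np.linalg.eigh(T)[1][:, ::-1]                        # leading EOF first

rcs = [reconstructed_component(x, E, [k]) for k in range(M)]
pair = reconstructed_component(x, E, [0, 1])
```

With these definitions the additivity (2.18) and the full expansion (2.19) hold to rounding error: `pair` equals `rcs[0] + rcs[1]`, and the sum of all `rcs` returns `x`.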

The main advantage of using RCs instead of PCs is the recovery of the epochs; indeed, if
there are short spells of oscillations in the signal, PCs do not allow one to localize them precisely,
whereas RCs do. Moreover, there is no phase shift between $x$ and $x^k$, except possibly near the
ends of the series: in the middle of the data set - where (2.17a) is used - the transform from
$x$ to $R_{\mathcal{A}}x$ is a linear filter whose response function

$$\frac{1}{M} \sum_{k \in \mathcal{A}} |\tilde{E}^k(f)|^2 \qquad (2.20)$$

is real valued. Therefore, at no frequency is there a phase shift between $x$ and $x^k$. Within $M$
points of the ends of the series, however, there might be some phase shift. This effect is small as
long as $\mathcal{A}$ is large, or suitably chosen, and, of course, totally disappears when
$\mathcal{A} = \{1, \ldots, M\}$. Based on a number of SSA experiments, we also found that there is no phase shift
at the ends of the series when $\mathcal{A}$ is made up of an oscillatory pair $(k, k + 1)$.

Our reconstruction algorithm may be compared to the Wiener filtering method [47], which
also provides optimal filters in a least-squares sense. The main differences are (i) that the
Wiener filter uses harmonic functions as a basis, and (ii) that it is not fully adaptive, in the sense
that one has to prescribe the shape of the desired filtered spectral density. Smooth and very
reliable estimates of the power spectrum are therefore required in the Wiener method, which are
impossible to obtain with short data sets.


3.  Noise  reduction 

The simplest kind of noise reduction is by applying a fixed, prescribed low-pass filter to the
data. This procedure is successful when the power spectrum of the process is rapidly
decreasing to zero. When the spectrum is not monotonic, and has lines or peaks distributed over a
wide range, the problem is more complicated. The gaps between these significant elements in
the spectrum are filled by spurious noise, and to filter out this noise requires more-complex
filters. As stated at the end of the last section, it is still possible to use the Wiener method in this
case, but only by assuming either accurate spectral estimates or prior hypotheses on the noise
variance; the former is impossible with short data sets, the latter is arbitrary.

In this section we show that SSA is a powerful tool for signal reconstruction from noisy data.
We assume that we have at our disposal a realization of a finite process $x$ of length $N$, which is
the sum of a dynamical process $y$ and of some external noise process $w$ due to measurement
errors. In VG, we showed that the last PCs of such processes are dominated by noise, after the
break in the eigenvalue spectra. We develop here a systematic method for determining the
break, and derive a process $R_{\mathcal{A}}x$, with $\mathcal{A} = \{1, \ldots, S\}$, approximating $y$ with the knowledge
of only a few $x$-data.

Fig. 5 shows the average of the ratio $n(p)$ as a function of $p$, over the 100 realization sets of the
four synthetic processes.

$$n(p) = \frac{\sum_{i=1}^{N} \big( y_i - \sum_{k=1}^{p} x^k_i \big)^2}{\sum_{i=1}^{N} w_i^2} \qquad (3.1)$$

$n(p)$ represents the noise-reduction ratio when $p$ RCs are considered. For P1, P2 and P3 (figs.
5a-5c), the average optimum noise reduction is at $p = 3$ or $p = 4$, and the reduced noise is less
than 10% of the original. With $M = 20$, the best reduction that can be achieved is not quite as
good as with $M = 40$. Even with short data sets and dominant noise, as for P2, reconstruction
with $p = 3$ or 4 is very close to the underlying signal. For the process P4 (fig. 5d), the optimal
reduction - to about 0.6 - occurs around $p = \tfrac{1}{2}M$. This relatively bad score is due to the fact
that the Lorenz system has a monotonically decreasing power spectrum, with no particular
frequency standing out, and therefore any linear noise-reduction technique unavoidably filters out
a part of the signal as well.

Fig. 5. Average error made when approximating the clean signal by the reconstruction filters $R_{\mathcal{A}}$, with $\mathcal{A} = \{1, \ldots, p\}$, divided
by the noise variance, as a function of $p$, for $M = 20$ (dashed) and $M = 40$ (solid). Dotted curves show the 95% confidence
interval for $M = 40$. (a) P1, (b) P2, (c) P3 and (d) P4.

In the real world, we do not have several realizations of the same process at our disposal,
and the pure signal is not known. The optimal order, therefore, needs to be calculated in a
different way. For a given order $p$, let us denote again by $\mathcal{A}$ the subset $\{1, \ldots, p\}$, and by $\mathcal{A}'$ the
subset $\{p + 1, \ldots, M\}$. The algorithm is based on the remark that if the reconstructed part of $x$
involving the subset $\mathcal{A}'$ of indices is really dominated by white noise, it should not be
significantly different from the reconstruction, using the same EOFs, of some pure gaussian
white-noise process. Let $v$ be a normal white-noise process. The problem is to find a positive
number $\beta$, such that $R_{\mathcal{A}'}x$ behaves like $R_{\mathcal{A}'}(\beta v)$. In this case, at least the autocovariance functions of
$R_{\mathcal{A}'}x$ and $R_{\mathcal{A}'}(\beta v)$ should be nearly equal at lags 0 to $M - 1$. That is, if SSA is reapplied to the
two processes, it will provide statistically indistinguishable results.

To test this idea, we use again Monte Carlo simulation. This type of simulation, going back
to the work of Ulam in the 1950s, has been used extensively of late in meteorological time-series
analysis [48]; it has been applied recently in nonlinear dynamics under the name of the
surrogate-data method [49]. We generate 100 normal realizations of length $N$ of the white-noise
process $v$. The operator $R_{\mathcal{A}'}$ is applied - for a given order $p$ - to the 100 realizations of $v$ and to the
signal $x$. Let us denote by $\omega$ the realization, by $c_\omega(j)$, $0 \le j \le M - 1$, the autocorrelation of the
reconstructed $\omega$th realization $R_{\mathcal{A}'}v$ at lag $j$, and by $c_p(j)$ the autocorrelation of $R_{\mathcal{A}'}x$. The
average $\bar{c}(j)$ over these realizations, the variances $s(j)$ of these estimates, and the 95% confidence
intervals $(c_-(j), c_+(j))$, with $c_\pm(j) = \bar{c}(j) \pm 1.96\sqrt{s(j)}$, are estimated. Then we look for $\beta$
such that, for $0 \le j \le M - 1$, $c_p(j)$ lies in the interval $(\beta^2 c_-(j), \beta^2 c_+(j))$; $\beta$ must therefore lie
in the intersection of $M$ intervals and must be positive. If this intersection is empty, the null
hypothesis of white-noise dominated behavior of the last $M - p$ components is rejected.
Otherwise, this intersection is itself an interval $(\gamma_p, \delta_p)$, and the last $M - p$ RCs may be
considered as a reconstruction of mere white noise. The smallest $p$ satisfying the above condition is
called the statistical dimension of the data set, and is denoted here (as in VG) by $S$. The
interval $(\gamma_S, \delta_S)$ gives bounds for the standard deviation $\beta$ of the noise present in the data.
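The last step of the test, intersecting the $M$ per-lag conditions on $\beta$, can be sketched in isolation. The function below is a hypothetical helper, not the authors' code: it takes the autocovariances $c_p(j)$ of the reconstructed data and the Monte Carlo bounds $c_-(j)$, $c_+(j)$ as given, and handles bounds of either sign, since the bounds at nonzero lags may straddle zero.

```python
import math

def beta_interval(c_p, c_minus, c_plus):
    """Return (gamma_p, delta_p), the range of beta > 0 for which
    beta**2 * c_minus[j] < c_p[j] < beta**2 * c_plus[j] at every lag j,
    or None when no such beta exists (white-noise hypothesis rejected)."""
    lo, hi = 0.0, math.inf                    # running bounds on b = beta**2
    for cp, cm, cpl in zip(c_p, c_minus, c_plus):
        # Each pair (a, s) encodes the constraint s * (b * a - cp) > 0.
        for a, s in ((cm, -1.0), (cpl, +1.0)):
            if a == 0.0:
                if s * (0.0 - cp) <= 0.0:     # constraint cannot hold for any b
                    return None
            elif s * a > 0.0:                 # constraint reads b > cp / a
                lo = max(lo, cp / a)
            else:                             # constraint reads b < cp / a
                hi = min(hi, cp / a)
    if lo >= hi:
        return None
    return math.sqrt(lo), math.sqrt(hi)

# Two lags, with made-up bounds for unit-variance white noise
accepted = beta_interval([4.0, 0.0], [0.8, -0.1], [1.25, 0.1])
rejected = beta_interval([4.0, 3.0], [0.8, -0.1], [1.25, 0.1])
```

In the first call the lag-0 condition restricts $\beta^2$ to $(3.2, 5)$ and the lag-1 condition adds nothing, so an interval is returned; in the second call the lag-1 autocovariance is too large for any admissible $\beta$, and the function returns `None`. The statistical dimension would then be the smallest $p$ for which an interval is returned.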

As shown by VG, $S$ has nothing to do with the dimension of an underlying attractor: $S$ simply
gives the number of significant components in an SSA expansion; it depends almost linearly on the
window $M$ for synthetic and observed data, although VG only used a heuristic criterion for
finding $S$. For almost all $p > S$, the conditions of nonrejection are fulfilled; sampling errors are
responsible for the other cases. For $p = S$ and $\mathcal{A} = \mathcal{S} = \{1, \ldots, S\}$, we obtain

$$x = R_{\mathcal{S}}x + R_{\mathcal{S}'}x. \qquad (3.2)$$

$R_{\mathcal{S}}x$ is an approximation of $y$,

$$y = R_{\mathcal{S}}x + e, \qquad (3.3)$$

where

$$e = R_{\mathcal{S}'}y - R_{\mathcal{S}}w. \qquad (3.4)$$

The error $e$ made when approximating $y$ by the significant reconstructed part $R_{\mathcal{S}}x$ of $x$ is due to
the difference of two quantities. The first one describes the part of signal that has been
removed by the reconstruction filter; the second one is the part of the noise that has not been
removed by it. The skill of the noise-reduction algorithm depends on minimizing both of these
spurious quantities. The quantity plotted in fig. 5 shows, in fact, the average variance ratio of $e$ to
$w$ as a function of $p$.
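The expression for the error can be recovered in a few lines. The derivation below assumes that the EOFs are held fixed (computed once from $x$), so that the reconstruction operators act linearly on $y$ and $w$ separately, and writes $\mathcal{S} = \{1, \ldots, S\}$ with complement $\mathcal{S}'$, so that $R_{\mathcal{S}} + R_{\mathcal{S}'}$ is the identity by (2.19):

```latex
\begin{aligned}
e &= y - R_{\mathcal{S}}x
   = y - R_{\mathcal{S}}(y + w) \\
  &= (y - R_{\mathcal{S}}y) - R_{\mathcal{S}}w \\
  &= R_{\mathcal{S}'}y - R_{\mathcal{S}}w .
\end{aligned}
```

The first term is the part of the signal carried by the discarded components, and the second is the part of the noise carried by the retained ones.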

The first contribution to $e$ dominates for small values of $p$, and is a monotonically decreasing
function, whereas the second contribution dominates for large values of $p$, and is monotonically
increasing. Processes for which the noise reduction is successful are those for which the variance
of the first contribution in $e$ decreases rapidly with $p$, i.e., processes for which a few
components explain a large part of the variance. Quasi-periodic processes and processes dominated by
low frequencies fall within this category. More generally, the algorithm will be successful when
a significant part of the power lies within a small fraction of the frequency range of interest.

Fig. 6 shows the distribution of $S$ obtained with 100 realizations of the various test
processes, using $M = 40$ and $M = 20$. For P1, P2, and P3 (figs. 6a-6c), there is but little difference
between the distribution with $M = 20$ and $M = 40$, while the histogram peak is, as desired, at
$S = 3$ or $S = 4$ (cf. fig. 5). Only for P2, the noisiest process, do many realizations actually
give $S = 2$. For P3, most of the values are $S = 4$; therefore, both oscillations are recognized as
significant, even though the noise has the same variance as the oscillation of period 7. For P4
(fig. 6d), the peak lies at $S = 10$ for $M = 20$ and at $S = 18$ for $M = 40$ (i.e., $S \approx \tfrac{1}{2}M$, like the
$p$-value for optimal noise reduction).

Fig. 6. Histograms of the statistical dimension estimate $S$, for $M = 40$ (solid) and $M = 20$ (open). Bars are placed to the left of the
abscissa point for $M = 20$, and to the right for $M = 40$. (a) P1, (b) P2, (c) P3 and (d) P4.

For the IPCC data set, the estimates of $S$ vary - with the ending year - from 14 to 22 (most
of the values equaling 18) for $M = 40$, and from 7 to 12 for $M = 20$. $S$ tends to increase towards
the end of the time series, probably due to the increasing amount of data. The estimate of $S$ is
therefore almost proportional to $M$, as expected (cf. VG), and otherwise well behaved. The
present values are larger than the estimate $S = 12$ of Ghil and Vautard [23], obtained for $M = 40$, but
coming from a different data set [37] and using a different Monte Carlo method; the earlier
method [23] compared the eigenvalue spectra (fig. 1 there) instead of the autocovariance function.
For the Jones et al. [37] data set the present method also gives lower $S$-values.

The characteristics of the noise's estimated standard deviation $\beta$ are summarized in table 1.
The estimates $\gamma_S$ and $\delta_S$ are stable, since their standard deviation at given $M$ is about one tenth
of their respective average values. These estimates are also fairly insensitive to the window
length, since doubling $M$ only changes the average by a few percent at most. For P1, P2, and
P3, $\gamma_S$ and $\delta_S$ are both nearly equal to the true noise standard deviation $\sigma$, all but 4 out of 12
values in the table being well within their own standard deviations. The error made in
estimating the noise is not only small - about 10% for $(\gamma_S, \delta_S)$ versus $\sigma$ - it is also on the safe side.

The excess of variance for the estimated noise is due to the fact that the algorithm tries to fit
$R_{\mathcal{S}'}x = R_{\mathcal{S}'}y + R_{\mathcal{S}'}w$ to $R_{\mathcal{S}'}v$, where $v$ is a
white-noise process. Since $c_-$ and $c_+$ are, in fact, bounds on the variance of $v$, and $y$ and $w$ are
uncorrelated, the excess arises from the fraction of the variance of $y$ contained in the frequency
bands involved by the filter $R_{\mathcal{S}'}$. It follows that, while in principle $\gamma_S \le \sigma \le \delta_S$, the only
completely reliable inequality in practice is $\sigma \le \delta_S$. The closeness of individual reconstructions to the
process $y$ can be checked for the four processes in fig. 1a-1d. The values of $\gamma_S$ and $\delta_S$ for the
IPCC data set are also quite stable for the time intervals under study.

Fig. 7a shows the average and the standard deviation of the 41 reconstructions $R_{\mathcal{S}}x$ of the
significant temperature signal, based on ending dates between 1950 and 1990. It is noteworthy
that the standard deviation at all times is very small compared with the fluctuations over time
of the average. The agreement between the filtered series for $M = 20$ and $M = 40$ is also quite
remarkable, even at the ends, in spite of the dependence of $S$ on $M$ and on ending date.

Table 1
Average and standard deviation of the bounds $\gamma_S$ and $\delta_S$ on the standard deviation of the
noise, as identified by the reduction algorithm. These values are calculated from the 100
synthetic realizations - for processes P1-P4 - and with the final year moving from 1950
to 1990 - for the IPCC data. Each column contains the average (left number) and the
standard deviation (right number).

Process   M = 40                                M = 20
          $\gamma_S$        $\delta_S$          $\gamma_S$        $\delta_S$
P1        1.67 ± 0.17       1.78 ± 0.18         1.66 ± 0.21       1.83 ± 0.17
P2        2.09 ± 0.20       2.24 ± 0.18         2.07 ± 0.25       2.30 ± 0.20
P3        0.71 ± 0.08       0.77 ± 0.07         0.68 ± 0.09       0.79 ± 0.07
P4        2.43 ± 0.33       2.57 ± 0.31         2.29 ± 0.40       2.56 ± 0.34
IPCC      0.065 ± 0.007     0.070 ± 0.004       0.070 ± 0.007     0.074 ± 0.003

Fig. 7. (a) Noise-reduced IPCC series, calculated with the 41 ending dates running from 1950 to 1990, for $M = 40$ (solid) and
$M = 20$ (dashed). The standard deviation of the noise-reduced data is represented by dotted lines (for $M = 40$ only). (b) Same as
(a) for the noise part only. Curves for $M = 20$ and $M = 40$ are almost indistinguishable, for the noise as well as the signal.

In fig. 7b are plotted the same quantities for the reconstructed noise part $R_{\mathcal{S}'}x$ of the signal.
Again, the variability from one estimate to another is considerably smaller than the noise
variability itself. We conclude that, despite the fact that individual eigenelements in the tail of
the expansion are not statistically stable, the global reconstruction process is stable - as the
length of the data set as well as the window length $M$ is varied.

The Monte Carlo method becomes rapidly
intractable numerically as $N$ increases, and the
above algorithm can be shown to have a bias
towards large values of $S$. Still, some experiments
performed with $N = 10\,000$ on P4 showed


that  the  noise-reduction  factor  (fig.  5)  has  a 
minimum  value  below  1  even  when  and 

one  can  identify  5  as  the  order  at  which  this 
minimum  occurs.  For  large  values  of  N,  the 
eigenvalue  spectrum  itself  becomes  much  more 
reliable  and  hence  should  provide  a  solution  to 
this  identification  problem:  as  a  noise  plateau 
emerges  more  clearly  with  larger  N,  the  break 
point  should  give  an  estimate  of  the  best  order 
p  =  S  to  use  for  noise  reduction.  Other  methods, 
like  the  Wiener  filter,  can  also  be  used  when  N  is 
large. 

4.  Interpretation  of  the  eigenelements 

4.1.  Trends  and  nonstationarities 

The interpretation of SSA results relies on the
assumption that the process x under study is
stationary in the weak sense, i.e., that the
second-order moments are invariant under translation,
although the Karhunen-Loève expansion
theorem does not require the stationarity of the
process [38]. Individual realizations of a process
of length N may indeed appear nonstationary.
This happens typically when periods larger than
N are present in the system, even if the process
is stationary, as in the example of fig. 3 in
section 2.1.

When  several  realizations  are  available,  it  is 
possible  to  check  for  stationarity.  For  instance, 
let  us  assume  that  the  visual  trend  present  in  the 
IPCC  temperature  data  (figs.  2  and  7a)  results 
from  a  natural  climatic  oscillation  with  a  period 
of,  say,  500  years;  then  another  realization  of 
this  process  might  show  a  decreasing  trend. 
However,  when  only  one  realization  is  available, 
it  is  impossible  to  distinguish  between  actual 
trends  or  nonstationarities  and  the  presence  of 
ultra-low frequencies. In practice, SSA still
works quite well, just as if the stationarity of the
process, along with the ergodicity of the realizations,
were satisfied.

The  distinction  between  trend  and  stationary 
ultra-low  frequency  can  be  crucial  in  a  given 
application.  For  instance,  if  the  temperature  data 
over  the  last  century  reflect  a  true  trend,  and  this 
trend is caused by anthropogenic increases in
greenhouse trace gases, such as carbon dioxide
(CO2), then a number of technological and
socio-economic consequences follow [36]. The
statistical significance of the trend in the data
was established by Kuo et al. [50], while the
causal role played by CO2 increases is plausible
but not definitively confirmed.

From the point of view of studying the higher
frequencies clearly manifest in a time series, the
presence of either a trend or an ultra-low frequency
is a major impediment. Various detrending
methods exist, such as prewhitening [40],
polynomial fits [51] and spline fits [52]. They all
have some advantages and serious drawbacks.
Ghil and Vautard [23] showed that SSA provides
an effective and adaptive method of detrending,
with little if any undesirable aliasing. We derive
here a systematic data-adaptive algorithm for
removing trends or ultra-low frequencies in a
given data set.

The algorithm is based on the same principle
as noise reduction. If the trend is sizable, it
should appear in the first few PCs. Bearing this
in mind, we use the nonparametric test of Kendall
for global trend identification [29]: consider
a sequence of values ($x_i$, $1 \le i \le n$) and count the
number $K_r$ of pairs of indices (i, j), with i < j,
such that $x_i < x_j$. Roughly speaking, if $K_r$ is
large, there is a positive trend in the series, and
if $K_r$ is small, there is a negative trend in the
series. More precisely, in the absence of a trend,
the distribution of the coefficient

$$\tau = \frac{4K_r}{n(n-1)} - 1 \qquad (4.1)$$

tends rapidly to a normal distribution with zero
expectation and standard deviation

$$s = \sqrt{\frac{2(2n+5)}{9n(n-1)}}. \qquad (4.2)$$

The hypothesis of no trend is rejected, therefore,
when the measured value of $\tau$ is outside the
interval (-1.96s, +1.96s), with a 5% chance of
being wrong.
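As an illustration, the test of eqs. (4.1) and (4.2) can be sketched in a few lines of Python. This is a minimal sketch, not the code used for the results reported here; the function name `kendall_trend_test` and the argument `z_crit` are assumptions introduced for the example, and ties are counted as "not increasing", as in the pair count described above.

```python
import math

def kendall_trend_test(x, z_crit=1.96):
    """Kendall trend test in the form of eqs. (4.1)-(4.2).

    Returns (tau, s, trend), where trend is True when the hypothesis
    of no trend is rejected at the level implied by z_crit
    (1.96 corresponds to a 5% chance of being wrong).
    """
    n = len(x)
    # K_r: number of index pairs (i, j), i < j, with x[i] < x[j]
    k_r = sum(1 for i in range(n - 1)
                for j in range(i + 1, n) if x[i] < x[j])
    tau = 4.0 * k_r / (n * (n - 1)) - 1.0                    # eq. (4.1)
    s = math.sqrt(2.0 * (2 * n + 5) / (9.0 * n * (n - 1)))   # eq. (4.2)
    return tau, s, abs(tau) > z_crit * s
```

A strictly increasing sequence gives $\tau = 1$ and a strictly decreasing one gives $\tau = -1$; the double loop costs O(n^2) operations, which is negligible for the series lengths considered here.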

The test is applied to the successive PCs, using
n = N - M + 1, and we denote by $T_1$ the first
order such that the corresponding PC has no
significant trend. Detrending is then performed
by reconstruction over the set $\mathcal{A} = \{T_1, T_1 + 1, \ldots, M\}$.
Since the ends of the time series
may lead to artificial trends, a second Kendall
test is performed on the detrended series. If a
trend is still detectable, reconstruction over the
set $\mathcal{A} = \{T_1 + 1, \ldots, M\}$ is performed and again
tested, and so on. The first order $T^*$ for which
the reconstructed series has no significant trend
determines the order of the detrending process,
and, in obvious notation, $R_{T^*}x$ is the detrended
series.
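The detrending loop just described can be sketched as follows, assuming NumPy is available. The sketch is illustrative only: the helper names (`ssa`, `reconstruct`, `ssa_detrend`) are hypothetical, the lag-covariance estimator and the averaging used in the reconstruction are standard choices assumed here rather than quoted from section 2, and component orders are 0-based inside the code.

```python
import numpy as np

def has_trend(x, z_crit=1.96):
    """Kendall test, eqs. (4.1)-(4.2): True if a trend is significant."""
    x = np.asarray(x, float)
    n = len(x)
    k_r = sum(int(np.sum(x[i] < x[i + 1:])) for i in range(n - 1))
    tau = 4.0 * k_r / (n * (n - 1)) - 1.0
    s = np.sqrt(2.0 * (2 * n + 5) / (9.0 * n * (n - 1)))
    return abs(tau) > z_crit * s

def ssa(x, M):
    """EOFs (columns of E) and PCs (columns of A), by decreasing variance."""
    x = np.asarray(x, float) - np.mean(x)
    N = len(x)
    # Toeplitz lag-covariance matrix built from c(j), j = 0..M-1 (assumed estimator)
    c = np.array([np.dot(x[:N - j], x[j:]) / (N - j) for j in range(M)])
    C = c[np.abs(np.subtract.outer(np.arange(M), np.arange(M)))]
    _, E = np.linalg.eigh(C)          # eigenvalues in ascending order
    E = E[:, ::-1]                    # leading EOF first
    X = np.array([x[t:t + M] for t in range(N - M + 1)])
    return E, X @ E                   # A[t, k] = sum_j x[t + j] E[j, k]

def reconstruct(E, A, comps, N):
    """Reconstruction over the component set comps, averaged along lags."""
    M = E.shape[0]
    r, cnt = np.zeros(N), np.zeros(N)
    for j in range(M):
        cnt[j:j + N - M + 1] += 1.0
        for k in comps:
            r[j:j + N - M + 1] += A[:, k] * E[j, k]
    return r / cnt

def ssa_detrend(x, M):
    """Return (detrended series, T*), with T* counted from 1 as in the text."""
    N = len(x)
    E, A = ssa(x, M)
    # T_1: first order whose PC shows no significant trend
    T = next((k for k in range(M) if not has_trend(A[:, k])), M - 1)
    detrended = reconstruct(E, A, range(T, M), N)
    # Re-test the reconstruction and drop one more component while needed
    while T < M - 1 and has_trend(detrended):
        T += 1
        detrended = reconstruct(E, A, range(T, M), N)
    return detrended, T + 1
```

With this convention a returned value of 1 means that no component was removed, and the trend itself is the reconstruction over the first T* - 1 components.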
For the processes P1-P4, T* was always found
to be 1. For the IPCC data T* was found to lie
between 2 and 4, the variations being mainly due
to changes in the order of the eigenvalues and to
our restricting attention, for simplicity, to the
leading eigenelements, as far as the trend is
concerned. Fig. 8 shows the average estimate of
the IPCC temperature trend $R_{\mathcal{A}}x$,
$\mathcal{A} = \{1, \ldots, T^* - 1\}$, calculated with different ending
dates, for M = 40 and M = 20, in the same
format as figs. 7a, 7b. Again, individual trend
estimates are almost identical. For M = 20, a
low-frequency component is mixed in with the
trend, justifying the use of larger windows by
Ghil and Vautard [23]. The detrending algorithm
is quite stable, even at the ends of the series, for
M = 40.

4.2.  Pairs  of  eigenelements 

In VG, we showed that when a vigorous,
albeit irregular, oscillation is present, a pair of
nearly equal eigenvalues stands out of the spectrum
and that the associated eigenvectors and
PCs are in quadrature. Even for a pure sinusoid,
the two associated eigenvalues are not exactly
equal (see eqs. (2.7d), (2.7e) of VG), so in
practice it can be difficult to tell statistical degeneracy
from oscillatory pairing. Ghil and Mo [25]
therefore introduced an ad hoc criterion for the
significance of quadrature between the PCs of a
pair, based on their lag correlation. The main
difficulty in this approach is the lack of reliable
statistical significance estimators of the lag correlation,
especially when the processes are nearly
periodic, as the PCs are suspected of being.

We propose here instead two natural criteria
based on the spectral properties of the eigenvectors.
The first criterion is based on the remark
that oscillating pairs of eigenelements (k, k + 1)
must be spectrally localized around the same
frequency. The squares of the reduced Fourier
transforms $|\tilde{E}^k(f)|^2$ of EOFs k and k + 1 are
calculated, cf. eq. (2.6), at 500 equally sampled
frequencies f between 0 and 0.5 cpu, and the
frequency $f_k$ corresponding to the maximum
value is estimated. Then, for orders k and k + 1
to represent an oscillatory pair, the difference
$\delta f_k = |f_k - f_{k+1}|$ has to be small. For a pure
red-noise process (see VG for the analytical calculation
of the EOFs in this case), one has
$\delta f_k = 1/2M$. Since we want to exclude pairs for
this type of process, we impose the criterion
$2M\,\delta f_k < 0.75$.

Although necessary, this criterion is not sufficient,
since the amplitude of the peaks must also
be high. In fact, if the presumed oscillatory pair
completely resolves a frequency $f^*$ between $f_k$
and $f_{k+1}$, the response function $\rho_k(f^*) + \rho_{k+1}(f^*)$
of the reconstruction filter based on
components k and k + 1, given by eq. (2.20),
must be close to 1 at this frequency, cf. eq.
(2.14). Therefore, the maximum value of
$\rho_{\mathcal{A}}(f)$, $\mathcal{A} = \{k, k + 1\}$, is calculated and the pair is kept
as an oscillatory pair only when this maximum is
larger than 2/3, i.e., only if at least two thirds of
the variance of x at the peak frequency $f^*$ is
described by the pair in question.
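The first, frequency-localization criterion lends itself to a short sketch. The code below is a hypothetical illustration rather than the implementation used for table 2: the function names are introduced here, and the reduced Fourier transform of eq. (2.6) is assumed to be a plain discrete Fourier sum over the M elements of an EOF, whose normalization does not affect the position of the maximum. The second criterion is not covered, since it requires the filter response of eq. (2.20).

```python
import cmath
import math

def peak_frequency(eof, n_freq=500, f_max=0.5):
    """Frequency in [0, f_max] maximizing |sum_j E_j exp(2 pi i f j)|^2."""
    best_f, best_p = 0.0, -1.0
    for m in range(n_freq):
        f = f_max * m / (n_freq - 1)
        ft = sum(e * cmath.exp(2j * math.pi * f * j)
                 for j, e in enumerate(eof))
        p = abs(ft) ** 2
        if p > best_p:
            best_f, best_p = f, p
    return best_f

def same_frequency(eof_a, eof_b, threshold=0.75):
    """First pairing criterion: 2 M |f_a - f_b| < threshold."""
    M = len(eof_a)
    delta_f = abs(peak_frequency(eof_a) - peak_frequency(eof_b))
    return 2 * M * delta_f < threshold
```

For two EOFs in quadrature at the same period the peak frequencies differ by much less than 1/2M and the pair passes; two EOFs with clearly different periods, such as 8 and 20 time units for M = 40, fail.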

These two criteria are applied to the realizations
of the four synthetic processes. Results are
summarized in table 2. For M = 40, despite the
large variance of the noise in P2, the algorithm
still finds a pair corresponding to the period 20
for all realizations, whereas the period 7 is identified
as a pair in 34 realizations only. For P1 and
P3, the algorithm is successful in general, with
pair 1-2 corresponding to the period 20 and pair
3-4 to the period 7. Each realization in which
the pair 3-4 (and 1-2) is not recovered corresponds
to values of S smaller than 4 (or 2,
respectively). The number of spurious pairs is
small for P1, P2 and P3. For M = 20, the number
of spurious pairs is slightly reduced, as is the
number of successful pairs, showing that our
criteria are not very sensitive to a change in the
window length.

Table 2
Oscillatory pairs for the four synthetic processes P1-P4, based on 100 independent
realizations, for M = 20 and M = 40. First row: average number of pairs per realization.
Second row: number of realizations showing a pair for the oscillation with period 20.
Third row: same as second row for the oscillation with period 7. Last row: average
number of spurious pairs.

                    P1              P2              P3              P4
              M = 20  M = 40  M = 20  M = 40  M = 20  M = 40  M = 20  M = 40
Pairs          1.49    1.68    1.14    1.60    2.14    2.46    1.78    5.51
Period 20      100     100      92     100     100     100      /       /
Period 7        35      47      18      34      99     100      /       /
Spurious       0.13    0.20    0.04    0.24    0.16    0.44    1.78    5.51

P4 should not be reducible to nearly periodic
components, except for the problems arising in
the Lorenz system from the presence of embedded
(unstable) periodic orbits of arbitrary length,
on the one hand, and from finite sample length,
on the other. The average number of detected
pairs is in fact high, due to these problems. We
shall see in section 4.4 that even the most sophisticated
spectral methods of classical type find, for
these data, a certain number of significant peaks,
for the same reasons: they are simply there for
any finite segment of a trajectory. Indeed, spells
of oscillations occur when the trajectory spirals
around the unstable fixed points, for an exponentially
distributed length of time, and with
varying mean periods. As a consequence, the
frequency of the oscillations detected varies with
realization, rather than being fixed, as it is for
P1, P2 and P3 (see fig. 10 below).

The evolution of the pairing for the IPCC time
series was calculated as a function of the ending
date (not shown). After the components corresponding
to the trend, an average of about three
pairs are found (for M = 40), associated with
rather stable frequencies (see fig. 11 below). For
the whole series (ending in 1990), we find five
oscillatory pairs with peak periods similar to
those found in Ghil and Vautard [23], i.e., 26
years (pair 3-4), 15 years (pair 7-8), 10 years
(pair 5-6), 5.2 years (pair 9-10) and 4.6 years
(pair 16-17). Fig. 9 shows selected RCs of the
IPCC series. The last two pairs just mentioned
describe most of the variance with periods of 4-6
years, associated with the low-frequency component
of the ENSO phenomenon (see also refs.
[7, 22, 26, 53]). While the pairing itself, as a function
of ending date, is not very stable for the
IPCC data set (cf. also ref. [24]), both the reconstructions
(fig. 9) and the dominant periods (fig.
11 below) are quite stable.

4.3.  SSA  and  MEM  spectral  estimates 

Penland et al. [26] advocated the use of SSA
for noise reduction before applying the maximum
entropy method (MEM) to estimate the
power spectrum of a time series. They estimated
the power spectra of each of the S significant PCs
with MEM, based on the formulation of Burg
[32], and truncated eq. (2.15) to these PCs to
obtain an approximation of the power spectrum
of x. The generally regular behavior of the
PCs allowed them to obtain good spectral resolution
with low-order AR models. The striking
advantage of this SSA prefiltering is that it
eliminates the spurious peaks inherent to high-order
MEM estimates. We refine this approach
here by deriving a fully consistent SSA-MEM
spectral estimate.
Fig. 9. Reconstructed subsets $R_{\mathcal{A}}x$ of the IPCC series with ending date in 1990 only. The global average is removed.


We assume that the autocovariance function
c(j) is known up to lag j = M - 1. Using the
notation of eqs. (2.9), (2.10), an AR process z
of order M - 1 is fitted to the time series x, using
the Yule [33]-Walker [34] method,

$\Phi(B)\, z_t = v_t$,  (4.3)

where v is a white-noise process of variance $\sigma^2$
and $\Phi(\zeta)$ a polynomial of degree M - 1, with
real coefficients, of the complex variable $\zeta$. The
PCs $b^k$ of the fitted process satisfy

=  (4.4) 

The power spectrum $P_k(f)$ of $a^k$ may thus be
approximated by the spectrum of $b^k$,

Pkif)=  f  =  exp(2Tri/) ,  (4.5) 

where $\sigma^2$ is the variance of the noise process v.
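The Yule-Walker fit that underlies eq. (4.3) can be illustrated with the Levinson-Durbin recursion. The sketch below is not taken from the original computations: the function names are hypothetical, the AR model is written as $z_t = \sum_j \phi_j z_{t-j} + v_t$, and only the spectrum of the fitted AR process itself is evaluated, i.e., the standard expression $\sigma^2 / |1 - \sum_j \phi_j \zeta^j|^2$; the per-PC spectra of eq. (4.5) are not reproduced.

```python
import cmath
import math

def yule_walker(c):
    """Solve the Yule-Walker equations by Levinson-Durbin recursion.

    c: autocovariances c(0), ..., c(p). Returns (phi, sigma2) for the
    AR(p) model z_t = sum_j phi[j-1] * z_{t-j} + v_t, var(v) = sigma2.
    """
    p = len(c) - 1
    phi, sigma2 = [], c[0]
    for m in range(1, p + 1):
        acc = c[m] - sum(phi[j] * c[m - 1 - j] for j in range(m - 1))
        k = acc / sigma2                      # reflection coefficient
        phi = [phi[j] - k * phi[m - 2 - j] for j in range(m - 1)] + [k]
        sigma2 *= (1.0 - k * k)
    return phi, sigma2

def ar_spectrum(phi, sigma2, f):
    """Power spectrum of the fitted AR process at frequency f (cycles/unit)."""
    zeta = cmath.exp(-2j * math.pi * f)
    denom = 1.0 - sum(a * zeta ** (j + 1) for j, a in enumerate(phi))
    return sigma2 / abs(denom) ** 2
```

In the setting of this section one would pass the M autocovariances c(0), ..., c(M - 1), giving an AR model of order M - 1. For the exact autocovariances of an AR(1) process with coefficient 0.6 and unit noise variance, the recursion returns 0.6 followed by zeros.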

In figs. 10a-10d, the histograms of the peak
frequencies of the estimates (4.5) of $P_k(f) + P_{k+1}(f)$
are displayed. Fifty bins between 0 and
0.5 are used for this calculation. For P1, P2 and
P3, the spurious peaks are distributed almost
uniformly along the frequency axis. The probability
of success in the identification of the
oscillation with period 20 is close to 1 for all
three processes, whereas it varies for the oscillation
with period 7, P2 being the worst case and
P3 the best, as expected. The spurious peaks,
which are described by the oscillatory pairs for
P4 (cf. table 2), are concentrated towards the
lower frequencies, where the actual power is
higher. These remarks hold both for M = 40 and
M = 20, although there is somewhat less success
for the latter in identifying the period 7 at high
noise levels (figs. 10a, 10b). It follows that,
almost independently of (sufficient) window
length, SSA separates oscillations from noise
even when the noise variance is higher than that
of the oscillations.

For the IPCC series, the results are presented
in fig. 11, as a function of the ending year. The
most stable oscillations are those with periods
around 4-5 years, 10 years and 15 years. An
interdecadal oscillation with a period of 20-30
years is also in evidence for about half the
ending years. Note the similarities with the peaks
estimated by Ghil and Vautard [23], using the
Jones et al. [37] time series up to 1988 only.
Fig. 10. Histogram of frequencies associated with the oscillatory pairs; frequencies are estimated by the maximum entropy
method (MEM). Fifty bins of width 0.01 have been used to count the number of frequencies; the place of the bars for M = 20
(shaded) and M = 40 (solid) relative to the frequency bins is the same as in fig. 6. (a) P1, (b) P2, (c) P3 and (d) P4.

The theoretical advantage of the formulation
(4.5), compared to that of Penland et al. [26],
where an AR model is fitted to each individual
significant PC, is that the additive property
(2.15) of the spectra is conserved exactly. It
justifies rigorously the display of stack spectra
of the PCs of the process x: the stack spectrum
of order p is the partial sum of (2.15) truncated
at order p, i.e.,

$$P^{p}(f) = \sum_{k=1}^{p} P_k(f), \qquad (4.6)$$

with $P_k(f)$ as in eq. (4.5) and $\zeta = \exp(2\pi i f)$.


Fig. 11. Periods associated with the quasi-periodic components
for the IPCC series, as a function of the moving final
year (ending years 1950 to 1990).


Fig. 12 shows the stack spectra for the IPCC
series ending in 1990, with p = 2, p = 8, p = S =
18, and p = M = 40 (raw MEM). The three low-frequency
oscillations described by the pairs
(3, 4), (5, 6) and (7, 8) have periods of about 26,
15, and 10 years. Most of the variance associated
with PCs 9 to 18 is related to ENSO oscillations,
with periods between 4 and 6 years. Note that
the peaks found at periods 4.2 and 3.2 years are
rejected as noise, as well as most of the variance
at periods below 4 years. There is remarkable
agreement with the results of Ghil and Vautard
[23], using somewhat different raw data [37] and
MTM (their fig. 2). Despite the difference between
the data sets and the method, we still
obtain an interdecadal oscillation, the ENSO 4-6
year activity, and the two other periods (of 10
and 15 years).

Fig. 12. Stack MEM spectra of the IPCC series (1861-1990), for various subsets of PCs, together with the raw MEM spectrum
(i.e., the stack 1-40); M = 40.

4.4.  Multi-taper  spectral  estimates 

The multi-taper method (MTM), devised by
Thomson [35] and based on the work of Slepian
[54], is a nonparametric spectral analysis method,
unlike MEM which assumes an AR signal. It
provides, moreover, an array of statistical tests
for the spectral estimates. MTM was applied to
various geophysical time series [55-57] and more
recently to several climatic time series, from
historic temperature data [23, 50] through tree-ring
data [58], to late Pleistocene data [52, 59].
We review briefly the main properties of MTM
(see also appendix A of ref. [52] for succinct
details) and apply it to the previously defined
synthetic and IPCC data sets.

MTM aims to eliminate the spectral bias induced
by finite sampling. Given a line frequency
signal perturbed by white noise,
$x_i = \mu \sin(2\pi i f_0) + w_i$, $1 \le i \le N$, and a frequency
band $B = (f_0 - W, f_0 + W)$, one wants to minimize
the power leakage of $f_0$ outside B, with a
tapered signal $x_i v_i$, $1 \le i \le N$. This goal is
achieved with a subset of the first [2WN]
Slepian sequences, otherwise called discrete prolate
spheroidal sequences (DPSSs). These sequences
($v^{(k)}$, $1 \le k \le K$) are easily computed,
given W and N [54, 59].

The tapers are explicitly devised to optimize
the estimate of line frequencies $f_0$ and the corresponding
amplitudes $\mu$. An estimate $\hat{\mu}$ of $\mu$,
for $f_0$ known, is derived from a least-squares
regression with respect to the coefficients of the
K tapered signals, yielding

$$\hat{\mu}(f_0) = \frac{\sum_{k=1}^{K} V_k^*(0)\, y_k(f_0)}{\sum_{k=1}^{K} |V_k(0)|^2}, \qquad (4.7)$$

where $y_k(f)$ is the Fourier transform of the kth
tapered signal $x v^{(k)}$ and $V_k(f)$ is the Fourier transform
of the kth taper $v^{(k)}$. Notice that the amplitude
estimate $\hat{\mu}(f_0)$ is unbiased when the noise w is
white [56]. In practice, $f_0$ is not known a priori,
and the estimate (4.7) is computed for $\hat{\mu}(f)$ at
different f, to estimate the position $f_0$ of the
maximum, as well as its value $\hat{\mu}(f_0)$.
One can test the validity of this estimate by
calculating the ratio of the explained to the
unexplained variance. It turns out that the
variable

$$F(f) = \frac{(K-1)\, |\hat{\mu}(f)|^2 \sum_{k=1}^{K} |V_k(0)|^2}{\sum_{k=1}^{K} |y_k(f) - \hat{\mu}(f) V_k(0)|^2} \qquad (4.8)$$

follows a Fisher-Snedecor distribution, with 2
and 2K - 2 degrees of freedom for the
numerator and the denominator, respectively,
when the noise is white [56]. Consequently, one
rejects the null hypothesis $\mu = 0$ (i.e., that the
series is white) with a probability 1 - q of being
wrong, when the value F(f) exceeds a threshold
$F_q$ such that $\mathcal{P}(F < F_q) = q$.
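The regression (4.7) and the variance ratio (4.8) can be sketched numerically as follows, assuming NumPy. The sketch is illustrative and its function names are hypothetical; the Slepian sequences are obtained from the symmetric tridiagonal eigenproblem commonly used for this purpose, which is an implementation choice made here and not a description of the computations behind the figures. For a real sinusoid of amplitude $\mu$ the complex amplitude estimated at the line frequency has modulus close to $\mu/2$.

```python
import numpy as np

def dpss_tapers(N, NW, K):
    """First K Slepian sequences (rows, unit norm) for half-bandwidth W = NW/N."""
    W = NW / N
    i = np.arange(N)
    diag = ((N - 1 - 2 * i) / 2.0) ** 2 * np.cos(2 * np.pi * W)
    off = i[1:] * (N - i[1:]) / 2.0
    T = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    _, vecs = np.linalg.eigh(T)       # ascending eigenvalues
    return vecs[:, ::-1][:, :K].T     # largest eigenvalues first

def mtm_f_test(x, NW=6, K=8, n_freq=512):
    """Amplitude estimate (4.7) and F variable (4.8) on a frequency grid."""
    x = np.asarray(x, float) - np.mean(x)
    N = len(x)
    v = dpss_tapers(N, NW, K)                          # (K, N)
    freqs = np.linspace(0.0, 0.5, n_freq)
    basis = np.exp(-2j * np.pi * np.outer(freqs, np.arange(N)))
    y = (v * x) @ basis.T                              # y_k(f), shape (K, n_freq)
    V0 = v.sum(axis=1)                                 # V_k(0), real
    mu = (V0 @ y) / np.sum(V0 ** 2)                    # eq. (4.7)
    resid = np.sum(np.abs(y - np.outer(V0, mu)) ** 2, axis=0)
    F = (K - 1) * np.abs(mu) ** 2 * np.sum(V0 ** 2) / resid   # eq. (4.8)
    return freqs, mu, F
```

The defaults NW = 6 and K = 8 are the values quoted below for the experiments; under the white-noise hypothesis the values of F are to be compared with a Fisher-Snedecor quantile with 2 and 2K - 2 degrees of freedom.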

Notice that for a single, finite realization, the
maxima of $\hat{\mu}(f)$ and F(f) will not coincide in
position. The position of a maximum in F(f)
provides an unbiased estimate, to first order in the
detuning $f - f_0$, of a line component
when the noise is white. Hence we use F(f)
to estimate the position of the peaks. Moreover,
the value of F does not depend, to first order,
on the magnitude of $\mu$, allowing the detection of
small-amplitude oscillations with less ambiguity
than traditional spectral analysis methods. This
estimate also appears to be robust in practice to
the slope of the noise spectrum: the usual white-noise
assumption can be replaced by colored
noise, yielding a negative rather than zero slope
in the frequency domain, like that of most
climatic time series [60].

If the process x is composed of several lines,
separated by at least 2W, the procedure still
applies in principle, due to the rough independence
of the estimates at the different lines.
Nonetheless, as we shall see, the test (4.8) detects
more sharp lines than are actually present in
the signal, and the number of spurious peaks
appears to be stable in our Monte Carlo experiments,
even though the lines themselves occur at
random frequencies. For a pure white noise,
F-values follow a Fisher distribution. In theory,
therefore, if q is the probability of F being less
than $F_q$, the expected number of F-values above
$F_q$ is (1 - q)N. It turns out in our experiments on
pure white noise (not shown) that the number of
peaks above $F_q$ depends in fact on the bandwidth
2W and the number of tapers K; (1 - q)N appears
therefore to be only a rough upper bound
for the number of spurious peaks MTM will
exhibit.

We investigate here the effects of SSA noise
reduction on MTM. Contrary to MEM, there is
no simple algebraic link between MTM and
SSA. We have calculated, for the raw series and
for the reconstructed noise-reduced series, the
MTM spectral estimates, with WN = 6 and K =
8. The position of the peaks significant at the
95% confidence level is displayed in histogram
form in figs. 13a-13d. Only the case of noise
reduction with M = 40 is discussed, results being
similar for M = 20.

For P1, the two lines appear for every realization,
the period 20 being distributed within two
histogram bins and the period 7 within one bin
only (fig. 13a). No significant peaks occur in the
vicinity of the two lines. The spectral power in a
2W band around those frequencies is absorbed
by the lines. Indeed, in this case, W = 0.04,
which corresponds to the gaps on each side of
the line peaks. The frequency band beyond
0.2 cpu yields a roughly continuous and flat
histogram: all frequencies are spuriously detected
in this noise band with equal probability.
Hence MTM is an efficient estimator of the real
lines in this case, and its spurious frequencies are
uniformly distributed. With a 95% statistical
confidence level, one expects at most seven
peaks in the MTM spectrum of a white-noise
process. For P1 (fig. 13a) there are four spurious
peaks on average above this threshold, each
peak being randomly distributed in the frequency
band (0, 0.5), the two gaps around the
line frequencies excepted.
Fig. 13. Histogram of line frequencies identified by the MTM's F-test, at a 95% significance level, before noise reduction (solid
bars) and after (shaded bars); M = 40. (a) P1, (b) P2, (c) P3 and (d) P4.


The noise reduction algorithm had a drastic
effect on the tail of the MEM spectrum, whereas
little change is observed for the occurrences of
the MTM line peaks. For P1, this procedure
reduces the number of spurious peaks exhibited
by a factor of 2, uniformly throughout the noisy
part of the spectrum, while the power of the
noise is divided by a factor of 10 (not shown). If
we increase (P2), or decrease (P3) the white-noise
variance, the MTM histograms of peaks
are quite similar to those for P1. Therefore,
MTM is relatively insensitive to the signal-to-noise
ratio, in the case of a quasi-periodic process.
Note that the SSA estimates (figs. 10a-10d)
show far fewer spurious peaks than MTM.
The number of SSA detections of the period 7 is
unfortunately also affected by the noise-reduction
algorithm, especially for P2 (fig. 10b), which
is not the case for the MTM estimates. SSA estimates
are thus more sensitive to the signal-to-noise
ratio, but tend to produce fewer spurious
peaks; i.e., SSA is more conservative.

The application of MTM to the process P4
indicates no preferred frequency, proving the
robustness of the method. Indeed, MTM considers
the Lorenz system as a colored-noise process,
with no greater probability of having a line in
some band than in another. SSA noise reduction
filters out higher frequencies, but the histogram
shape for MTM line detections does not change
much below 0.3 cpu. SSA estimates, on the other
hand, clearly detect peaks with a higher probability
at lower frequencies. The difference between
the two estimates is clear: SSA does not
try to find lines, but rather spectral bands with a
high percentage of explained variance.

The detrended IPCC series is again analyzed
with a final date moving from 1950 to 1990. In
fig. 14, we plot the periods of the peaks significant
at the 95% confidence level, in a display
format similar to fig. 11. For the detrended, but
still noisy data (reconstruction of components
from T* to M = 40; open circles in the figure), a
sequence of peaks at 15 and 9 years stands out.
At lower periods, a number of peaks occur
between 2.5 years and 6 years. The SSA noise-reduction
filter (RCs T* to S; pluses in the
figure) essentially removes the two highest-frequency
peaks. The peaks with periods of 21,
15, 9, 6, 4.6, 3.8, and 2.8 years all appear
significant, although the last two have a much
lower amplitude. This is in agreement with our
MEM results (fig. 12). Note that the interdecadal
oscillation with a period of 20-30 years detected
by Ghil and Vautard [23] is not very stable
at the 95% confidence level. Still, this interdecadal
oscillation is significant at the 90% confidence
level for every final date (not shown).

Fig. 14. Evolution of the periods identified by MTM, at the 95% significance level, calculated from the RCs $R_{\mathcal{A}}x$, with
$\mathcal{A} = \{T^*, \ldots, 40\}$ (circles), and with $\mathcal{A} = \{T^*, \ldots, S\}$ (plus signs), as a function of the ending year.

5.  Application  to  prediction 

Since the PCs are filtered versions of the signal
(cf. section 3) and typically band-limited (cf.
section 4), their behavior is more regular than
that of the raw series, and hence more predictable.
This leads to the heuristic idea of forecasting
only a subset of PCs, e.g., all the significant
ones or the oscillatory pairs exclusively. Thus
a (hopefully large) fraction of the variance can
be predicted with reasonable skill. The forecast
for the entire series is the sum of the expected
value of the complementary PCs, i.e., zero, and
of the forecast for the selected subset. Keppenne
and Ghil [7] showed, by forecasting the Southern
Oscillation index (SOI) of ENSO with considerable
skill out to two-and-a-half years, that this
approach can produce more accurate forecasts
on an important and well-studied, but very irregular
time series than any other method in
current use.

We follow Keppenne and Ghil [7] and fit an
AR model to each individual PC using the AR
coefficient estimates of Burg [32], with different
orders L. The order L = M - 1, which seems the
most consistent with the SSA analysis (cf. section
4.3), was found to be too large compared with the
simple behavior of the PCs: indeed, the variance
of the AR coefficient estimates increases with
the order [61]. Forecast errors were minimal, for
all signals analyzed, when the order was quite
low, $L \le 10$. For small orders, forecast errors are
dominated by the lack of resolution; for large
orders, by the error made on the coefficients of
the model. The results presented here are for
L = const. = 10, and remain valid for even smaller
orders.
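As a hedged illustration of this step, the following sketch fits an AR model of order L to a single series by Burg's recursion and extrapolates it. The function names are introduced here for the example and the code is written in plain Python for clarity; it is not the implementation used for fig. 15. The model convention is $x_t = \sum_j a_j x_{t-j} + e_t$.

```python
def burg(x, order):
    """Burg estimate of AR coefficients a_1..a_L (x_t = sum_j a_j x_{t-j} + e_t)."""
    ef = list(x[1:])     # forward prediction errors
    eb = list(x[:-1])    # backward prediction errors
    c = []               # prediction-error filter: x_t + sum_j c_j x_{t-j} = e_t
    for _ in range(order):
        den = sum(f * f for f in ef) + sum(b * b for b in eb)
        if den == 0.0:   # series already predicted exactly at a lower order
            break
        k = -2.0 * sum(f * b for f, b in zip(ef, eb)) / den
        m = len(c)
        c = [c[j] + k * c[m - 1 - j] for j in range(m)] + [k]
        ef_new = [f + k * b for f, b in zip(ef, eb)]
        eb_new = [b + k * f for f, b in zip(ef, eb)]
        ef, eb = ef_new[1:], eb_new[:-1]
    return [-cj for cj in c]

def ar_forecast(x, a, steps):
    """Extrapolate x by `steps` values with AR coefficients a."""
    hist = list(x)
    for _ in range(steps):
        hist.append(sum(a_j * hist[-1 - j] for j, a_j in enumerate(a)))
    return hist[len(x):]
```

In the procedure of this section, `x` would be one PC and `order` would be L = 10; the extrapolated PCs are then passed to the reconstruction algorithm. For a noise-free sinusoid an order of 2 already gives a nearly exact extrapolation.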

Once forecasts of the individual PCs are produced,
the reconstruction algorithm is applied, in
order to compare forecasts with real data. Let us
denote by $\mathcal{A}$ the subset of PCs used in the
forecast. If the original series is of length N, the
forecast for time $N + \tau$ of the sub-signal in question
is the reconstruction of the corresponding
extrapolated PCs $a^k_t$, with $1 \le t \le N - M + \tau$.
Note that this forecast for time $N + \tau$ differs
from the value at time $N + \tau$ produced by a
forecast for time $N + \tau'$, with $\tau' > \tau$. Indeed, the
forecast for time $N + \tau$ is the end of the reconstructed
series, whereas the reconstruction process
takes into account PC forecasts out into the
future in the latter case. The validation of the
forecasts is made by applying the exact same
procedure to the actual PCs as to the extrapolated
PCs.

In fig. 15, we plot, as a function of lead time $\tau$, the average forecast error for different subsets $\mathcal{A}$. The dotted curve represents the average, over the 100 realizations of the four synthetic processes, of the ratio

$$ r_1(\tau) = \frac{[F(N+\tau) - y(N+\tau)]^2}{\sigma_1} , \tag{5.1} $$

where $F(N+\tau)$ is the MEM forecast at time $N + \tau$, using the raw series $x$, with $L = 39$; $y(N+\tau)$ is the extrapolated clean signal, calculated from formula (1.5) for P1, P2, and P3, and with the Lorenz equations for P4. In eq. (5.1), $\sigma_1$ is the measured variance of $y$. With this normalization, a perfect forecast has an error of 0, while the "climatological" forecast, i.e., using the past average as the future forecast, has an average error of 1. For P1, P2 and P3, these forecast errors are smaller for longer lead times, around $\tau = 40$, before increasing again. This results from the fact that the spurious oscillations generated by noise have an e-folding time shorter than the significant ones. The forecast is therefore polluted by the effect of noise on the AR coefficient estimates.
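The normalization described above can be checked numerically with a short sketch. The function name `normalized_error` and the synthetic series are assumptions made for illustration only; the point is that a forecast equal to the sample mean gives an average normalized error of exactly 1 when the variance is measured on the same sample.

```python
import numpy as np

def normalized_error(forecast, truth, variance):
    """Squared forecast error divided by the measured variance of the target."""
    forecast = np.asarray(forecast, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return (forecast - truth) ** 2 / variance

# Hypothetical "clean signal": any series works for this check.
rng = np.random.default_rng(0)
y = rng.standard_normal(500)
var_y = np.var(y)

# Climatological forecast: always predict the average of the series.
clim_error = np.mean(normalized_error(np.full_like(y, y.mean()), y, var_y))
# Perfect forecast: predict the true value.
perfect_error = np.mean(normalized_error(y, y, var_y))
```

Values above 1 therefore indicate a forecast that does worse, on average, than simply predicting the mean.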


Fig. 15. Average errors of the SSA-MEM forecasts for the RCs $R_{\mathcal{A}}x$, normalized by the variance of the RCs of the clean signal (solid curves). Light lines stand for $\mathcal{A} = \mathcal{S} = \{1, \ldots, S\}$, and heavy lines for $\mathcal{A}$ being the union of the significant oscillating pairs. The dotted line represents the average error of the standard MEM forecast of the raw series, using the AR order $L = 39$. (a) P1, (b) P2, (c) P3 and (d) P4.
The two solid curves show, for two subsets $\mathcal{A}$, the average of the ratio

$$ r_2(\mathcal{A}, \tau) = \frac{[F_{\mathcal{A}}(N+\tau) - R_{\mathcal{A}}y(N+\tau)]^2}{\sigma_2} , \tag{5.2} $$

where now $\sigma_2$ is the measured variance of $R_{\mathcal{A}}y$, while $F_{\mathcal{A}}(N+\tau)$ is the forecast using the above algorithm with $\mathcal{A} = \mathcal{S} = \{1, \ldots, S\}$ (S-forecast: light solid) and with $\mathcal{A}$ equal to the union of the quasi-periodic pairs defined in section 4.2 (Q-forecast: heavy solid). Thus, $r_2(\mathcal{A}, \tau)$ represents the average forecast error made with respect to the corresponding reconstructed part of the (clean) signal.

Comparison of the solid curves in figs. 15a-15c with the dotted one shows that all forecasts produced by AR extrapolation of the PCs are much better, at all lags, than forecasts performed on the raw signal. For the three processes P1-P3, the error is reduced by a factor of 3 at the beginning and the PC forecasts are good for about 100 time units, i.e., five times the longest deterministic period involved. Raw MEM forecast errors grow more rapidly at longer lags than PC forecast errors. Forecast errors based on the oscillatory pairs only (Q-forecasts) are smaller than those based on all the significant components (S-forecasts).

For the Lorenz equations (fig. 15d), PC-based forecasts are not better than raw MEM forecasts, when compared with the clean signal. After two time units, forecast errors are greater than 1, indicating a total loss of predictive skill. When only oscillatory PCs are considered, a predictability limit of seven time units is found, corresponding to the average period of spiral motion around the unstable fixed points. The (intermittently) oscillatory components describe, in general, more than 50% of the total variance of the finite-length Lorenz signal.

These results leave us with the hope that, for signals with intermittent oscillations, a certain fraction of the variance can be forecast by linear models. When the oscillations are sustained, the predictability limit should be pushed even further, as shown by the results of P1, P2 and P3. The results are fairly stable to a change in window length, since for $M = 20$ (not shown), the behavior of the forecast errors is the same, with somewhat lesser skill.

For the IPCC data, forecast skill is established in a different way. For each year $N_y$, starting in 1950, the AR coefficients of the PCs are calculated with SSA from the training period (1861, $N_y$). Then, the AR models for the PCs are used to perform a series of forecasts starting at years $N_y + 1, \ldots, 1989$. The forecasts are compared to the actual reconstructions, and the error is compared with the error made by the climatological forecast (CF hereafter), i.e., the extrapolation of the PCs with the value 0, and by the persistence forecast (PF hereafter), i.e., the extrapolation of the PCs with the last known value; CF and PF are standard benchmarks for skill in numerical weather prediction and its extensions.

The average forecast error at lead time $\tau$ is given by the average of the quantity

$$ e_f(N_y, \mathcal{A}, t, \tau) = [F_{\mathcal{A}}(N_y + t + \tau) - R_{\mathcal{A}}x(N_y + t + \tau)]^2 \tag{5.3} $$

with respect to the initial years $N_y + t$. The CF error is the average of the quantity

$$ e_c(N_y, \mathcal{A}, t, \tau) = [R_{\mathcal{A}}x(N_y + t + \tau)]^2 , \tag{5.4} $$

and the PF error is the average of

$$ e_p(N_y, \mathcal{A}, t, \tau) = [R_{\mathcal{A}}x(N_y + t) - R_{\mathcal{A}}x(N_y + t + \tau)]^2 . \tag{5.5} $$

Note that for the IPCC time series, we only have at our disposal the raw data to compare with, not the clean signal. This is why, in eqs. (5.3)-(5.5), $R_{\mathcal{A}}x$ stands where $R_{\mathcal{A}}y$ does in eqs. (5.1), (5.2). The forecast errors with respect to the clean signal should obviously be lower.
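The three error measures can be written down in a few lines. The sketch below is only an illustration under stated assumptions: `rc` stands for a (detrended) reconstructed series stored as an array, `start` plays the role of the index of the initial year $N_y + t$, `lead` is $\tau$, and the function name `benchmark_errors` is hypothetical.

```python
import numpy as np

def benchmark_errors(rc, forecast, start, lead):
    """Squared errors of a model forecast, the climatological forecast (value 0)
    and the persistence forecast (last known value) for the target rc[start + lead]."""
    target = rc[start + lead]
    e_f = (forecast - target) ** 2     # model forecast error, cf. eq. (5.3)
    e_c = target ** 2                  # climatology predicts 0, cf. eq. (5.4)
    e_p = (rc[start] - target) ** 2    # persistence repeats rc[start], cf. eq. (5.5)
    return e_f, e_c, e_p

# Toy series with a steady trend: persistence beats climatology here.
rc_demo = np.arange(10.0)
e_f_demo, e_c_demo, e_p_demo = benchmark_errors(rc_demo, forecast=6.0, start=3, lead=2)
```

Averaging each quantity over the available initial years and comparing the averages gives the success or failure counts of the kind reported for the IPCC series.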
In table 3 appear the statistics on the ratio of the average forecast error to both the average CF and PF errors, for lead times of $\tau = 1$ to $10$ years, and $M = 40$. Only values of $N_y$ ranging from 1950 to 1979 were considered, since after that, average errors are calculated over fewer than 10 values and lose their statistical significance. If, for year $N_y$, lead time $\tau$ and subset $\mathcal{A}$, the average of the forecast error is lower than the average CF error $e_c$ (or the average persistence forecast error $e_p$), the model is successful for those values of the parameters. In table 3, we count the number of successes of the PC-based AR model, as well as the number of failures. Italic characters stand for cases when the number of successes is larger than the number of failures. These numbers are estimated for the subsets $\mathcal{A} = \{T^*, \ldots, S\}$, representing the trendless significant components, as well as for the subset $\mathcal{A}$ including only the quasi-periodic trendless components (S- and Q-forecasts, respectively).

Table 3
The left/right element in each column is the number of initial years $N_y$, $1950 \le N_y \le 1979$, having an average forecast error lower/higher than the climatological forecast (CLI) or the persistence forecast (PER). The results are calculated with $M = 40$ for the IPCC series. S stands for forecasts of the significant components, i.e., from order $T^*$ to $S$. Q stands for the quasi-periodic components, i.e., for the union of pairs lying in the interval $(T^*, S)$. The rows indicate the lead time $\tau$, in years. Italic characters indicate that the SSA-MEM forecast is more successful than the comparison forecast. The sum of the elements in each cell is the number of initial years, 30 for the S-forecasts and 29 only for the Q-forecasts, since for year 1970, no oscillatory pair was detected (see also fig. 11).

 tau   S/CLI    S/PER      Q/CLI    Q/PER
  1    30/0     30/0       29/0     29/0
  2    29/1     24/6       29/0     29/0
  3    23/7     19/11      29/0     28/1
  4    23/7     19/11      26/3     26/3
  5    24/6     14/16      26/3     25/4
  6    25/5     7/23       23/6     24/5
  7    23/7     8/22       19/10    26/3
  8    20/10    [missing]  18/11    26/3
  9    19/11    7/23       15/14    21/8
 10    22/8     9/21       16/13    21/8

Our S-forecasts are better than CFs at all lead times and better than PFs up to 4 years. Thus PFs are harder to beat than CFs. Careful examination shows that the CFs are, in fact, bad forecasts, because the "detrended" time series still has a small positive trend toward the end. All temperature forecasts produced were slightly below the observed values. Climatological forecasts, in particular, lie below the average of the end of the time series. Persistence forecasts are better since the last known value is a better estimate for a series with a (small) trend. Our forecasts decay eventually towards the average (of the past known values), since there is no noise forcing in the AR forecasts. The fact that they remain more successful than CFs indicates, however, that the sign of the anomalies is well forecasted. The phase of the oscillations is correctly estimated, whereas the amplitude is typically underestimated.

The Q-forecasts are better than CFs and PFs for all lead times considered, indicating that pairs do really correspond to linearly predictable phenomena. The price to pay is that Q-forecasts only predict, on average, about 15% of the total variance, 35% of the trendless variance, and 50% of the trendless and noiseless variance. Fig. 16a shows the particularly good forecast initiated from the year $N_y = 1953$, for the significant components 4-14. The forecast matches the series quite well up to 1968, i.e., 15 years ahead.

Fig. 16b shows the global temperature forecast to the end of the century. Global temperatures should decrease by about 0.2°C up to 1995-1996 before increasing again. For a complete forecast, the trend has to be added back in. If the increase of temperature given by the trend does not change (about 0.08°C for the last five years, cf. fig. 9), we should witness a decrease of about 0.12°C for 1995-1996. The global temperature would still be high compared with the beginning of the century, but would be close to the values for the early eighties.
Fig. 16. Global forecasts of detrended temperature, from IPCC data. (a) Based on data prior to 1953 only, of the reconstructed series $R_{\mathcal{A}}x$, with $\mathcal{A} = \{4, \ldots, 14\}$, together with the evolution of the actual quantity forecasted. (b) For the next ten years: heavy curve: $\mathcal{A} = \{3, \ldots, 18\}$; light curve: $\mathcal{A} = \{3, \ldots, 10, 16, 17\}$.


6.  Summary  and  discussion 

We have reviewed theoretical and algorithmic properties of singular-spectrum analysis (SSA), from the signal-processing point of view. This method of analysis is particularly useful when little or no knowledge of the underlying dynamical system is available, and the time series at hand are short and noisy. An outline of the main results follows.

(1) Parameter ranges are obtained for applying SSA successfully to the analysis of short signals. A particular method of estimating the Toeplitz matrix is shown to have little bias compared to other estimates. In order to correctly localize the spells of an oscillation of period $T$ and typical lifetime $T'$, we suggest choosing the window length $M$ so that $T \le M \le T'$.


(2) An algorithm for the reconstruction of isolated components, or groups of components, is presented. It allows one to expand the series $x$ under study into a sum of reconstructed components $x^k$, corresponding to the $k$th eigenelement of the Toeplitz matrix, at any given epoch.

(3) We develop a method for the identification of noisy and of significant components. The reconstruction based on the significant components only gives good and statistically stable estimates of the clean signal, whether the noise is white (as in the synthetic examples) or not (as in the IPCC data). This noise-reduction algorithm is shown to be particularly efficient, given short data sets, for processes with a large quasi-periodic component. For purely chaotic processes perturbed by noise, the proposed noise-reduction algorithm involves substantial reduction in the signal as well.

(4) A method is derived to recognize oscillatory pairs of eigenelements. These pairs are shown to correspond to the main periodicities in the data. The periodicities do not necessarily represent sustained harmonic components, but indicate at least intermittent periodic activity in the time series under study. SSA systematically recognizes an oscillation in a short signal (we assumed the knowledge of 150 sample points only) perturbed by white noise with variance as large as twice the variance of the oscillation in question. We also show that SSA can be used for detrending purposes, even for other than linear trends.

(5) Two advanced spectral-analysis methods can be used to good advantage in combination with SSA. First, we derive a fully consistent SSA-maximum-entropy-method (MEM) for power spectrum estimates; this approach respects the additive property of SSA response filters, and allows one to stack spectral estimates of the different components, thereby quantifying the spectral contribution of each eigenelement. Second, when the noise reduction and detrending algorithms are applied prior to multi-taper spectral estimation, the number of spurious spectral peaks is significantly reduced. This eliminates a major drawback of the multi-taper method (MTM), which otherwise provides a significance test for the line components it detects and is robust to noise characteristics.

(6) The regular behavior of the principal components (PCs) makes them easier to forecast than the complete signal. We show that using autoregressive (AR) models for the significant components increases the predictability, especially in the case of quasi-periodic signals embedded in noise. For chaotic dynamical systems with no particular frequency excited, linear models do not perform very well. However, these systems may hide intermittent oscillations. Depending on the variance explained by the latter, at least a fraction of the total variance can be forecasted. For the IPCC temperature data, oscillatory components account for about 50% of the significant variance, after trend removal, and are predictable more than 10 years ahead.

We have focused in this paper on single-channel analysis. The same calculations can be carried out in multi-channel problems. Reconstruction of selected components is done essentially in the same way [19]. The noise-reduction problem can also be approached in a similar manner. The interpretation of the eigenelements is different, since in multi-channel analysis the EOFs are time sequences of "spatial" patterns [18,62]. Therefore, when a pair of eigenelements emerges, it is usually associated with nearly linear traveling waves, as in the extended EOFs with fixed, small time window already used in meteorology [62,63].

MEM spectral estimates can be computed using multi-channel AR models [19]. Linear forecasting of the PCs is performed in the same way as in single-channel analysis. An interesting climate modeling application of multi-channel SSA could be to use such simple linear models in order to forecast low-frequency quasi-periodic components, such as the El Niño-Southern Oscillation (ENSO), and use this forecast in combination with a general circulation model, which in many cases does not reproduce such oscillations very accurately, but can provide other physical details of interest, related to shorter periodicities [64].
The success of noise reduction processes and forecasting methods as well as accurate calculations of fractal dimensions and Lyapunov exponents from scalar time series depends strongly on the quality of the reconstructed phase space. We present a comparison of four methods, which simultaneously estimate the two embedding parameters (delay time $\tau$ and sufficiently large embedding dimension $\dim_E$) for Takens' delay time coordinates. Recently we introduced a global geometrical measure, the fill-factor, and a local dynamical measure, the averaged integral local deformation. We will discuss these methods briefly and compare them to the topological wavering-product. In addition a simple algorithm is presented, which observes the spreading of trajectories at the transition from $\dim_E$ to $\dim_E + 1$ and gives the opportunity to estimate the correlation entropy $K_2$.

We  applied  these  algorithms  to  experimental  data  series  measured  from  a  rotational  Taylor-Couette  experiment  and 
verified  the  quality  of  the  obtained  embedding  parameters  by  the  estimation  of  correlation  dimensions  for  different 
reconstructed  attractors. 


1.  Introduction 

In many experimental situations the information about the system is contained in a single scalar time series. In the Taylor-Couette experiment, for example, we typically record one observable, which is the radial or axial component of the local velocity of the flow. The estimation of dynamical variables like fractal dimensions and Lyapunov exponents requires a proper reconstruction of an attractor in phase space, in order to extract all relevant components of the dynamics. We will show how an improper choice of the embedding parameters can falsify the results for the dynamical quantities.

N.H. Packard et al. [1] suggested the calculation of successive derivatives of the scalar time series as independent coordinates for a phase space reconstruction. A faster method was published by F. Takens [2] who proposed a reconstruction from delay time coordinates which is commonly used by almost all experimentalists in this field. As a conclusion from his theorem, one can estimate fractal dimensions and Lyapunov exponents of a complex system containing many variables (e.g. the dimension of phase space of the Navier-Stokes equations is infinite), when only one is measured.

We briefly describe Takens' method of reconstruction, which we use throughout this paper. Let $\xi(t_k)$ be the scalar time series, where $k \in K$ and $K := \{k \in \mathbb{N}_0;\ k < N_{dat}\}$. $N_{dat}$ is the number of data points and $\xi$ is the observable. State vectors in the reconstructed phase space are given by

$$ x_i = \begin{pmatrix} \xi(t_i) \\ \vdots \\ \xi(t_i + \tau(\dim_E - 1)) \end{pmatrix} , \tag{1} $$

where $i \in S$, $t_i = i T_a$, and $S := \{s \in \mathbb{N}_0;\ s < N_{dat} - (\tau/T_a)(\dim_E - 1)\}$. $\dim_E$ is the embedding dimension, $T_a$ is the sampling time of the continuous experimental signal and $\tau$ is the delay time, which is a multiple of $T_a$. According to Takens' theorem the projection of the orbit of a mapping on an $n$-dimensional manifold onto a time-delay constructed orbit in $\mathbb{R}^{\dim_E}$ is one to one if $\dim_E \ge 2n + 1$. The delay time $\tau$ can be chosen almost arbitrarily. This is correct for time series without noise, with an infinite number of data points and no restriction in resolution; however, none of these conditions holds for experimental data sets. The substantial influence of the delay time on the estimation of dynamical variables from experimental time series is shown in [3-6] and examples will be shown below.
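The construction of delay-time state vectors can be sketched in a few lines. This is an illustrative sketch only: the function name `delay_embed` is hypothetical, and the delay is assumed to be given directly as an integer number of samples (`lag`), i.e. the delay time divided by the sampling time.

```python
import numpy as np

def delay_embed(xi, dim_e, lag):
    """Return the matrix whose row i is the delay vector
    (xi[i], xi[i + lag], ..., xi[i + (dim_e - 1) * lag])."""
    xi = np.asarray(xi, dtype=float)
    n_vec = len(xi) - lag * (dim_e - 1)   # number of complete state vectors
    if n_vec <= 0:
        raise ValueError("series too short for this embedding")
    return np.column_stack([xi[j * lag: j * lag + n_vec] for j in range(dim_e)])

# Ten samples, embedding dimension 3, delay of 2 samples -> 6 state vectors.
vectors = delay_embed(np.arange(10), dim_e=3, lag=2)
```

The number of rows shrinks with both the embedding dimension and the delay, which is one practical reason why large embedding windows are costly for short experimental series.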

For practical purposes it is important to know the lowest embedding dimension $\dim_E$ which fulfils the condition stated above. This holds especially for algorithms working with matrices where the number of rows and columns depends on $\dim_E$, e.g. calculation of Lyapunov spectra and noise reduction algorithms.

A current method to find the embedding parameters makes use of the redundancy criterion, which measures a more general dependence of coordinates than the autocorrelation function. A. Fraser [7] derived this criterion from mutual information analysis [8]. D.S. Broomhead and G.P. King [9] used singular-value decomposition of the trajectory matrix to obtain the number of nonzero singular values, which yields a sufficiently large embedding dimension. A discussion of these methods can be found in [7].

Below we will only describe algorithms which try to yield an optimal delay time and proper embedding dimension simultaneously and can be implemented even on small computers. We applied the wavering-product algorithm and three methods proposed by us to experimental data series from a rotational Taylor-Couette experiment. The quality of the obtained embedding parameters is verified by the estimation of correlation dimensions for different reconstructed attractors.


2.  Algorithms 


2.1.  Wavering-product 


W. Liebert et al. [6] proposed a method which is guided by topological considerations. They focus on neighbourhood relations of points on the attractor: inner points remain inner points and points within the boundary defining the neighbourhood remain boundary points at the transition from $\dim_E$ to $\dim_E + 1$.

To reveal violation of this property they give two expressions:

$$ Q_1(i, k) := \frac{\mathrm{dist}_{\dim_E + 1}(i, j(k, \dim_E))}{\mathrm{dist}_{\dim_E + 1}(i, j(k, \dim_E + 1))} , \tag{2} $$

which is the ratio of distances measured in $\mathbb{R}^{\dim_E + 1}$ between the reference point $x_i$ and the point $x_{j(k, \dim_E)}$ (the $k$th nearest neighbour in $\mathbb{R}^{\dim_E}$), and between the reference point and the point $x_{j(k, \dim_E + 1)}$ (the $k$th nearest neighbour in $\mathbb{R}^{\dim_E + 1}$), and

$$ Q_2(i, k) := \frac{\mathrm{dist}_{\dim_E}(i, j(k, \dim_E))}{\mathrm{dist}_{\dim_E}(i, j(k, \dim_E + 1))} . \tag{3} $$

Notice that the distances in $Q_2(i, k)$ are calculated in $\mathbb{R}^{\dim_E}$. Fig. 1 shows a situation where a
limit  cycle  is  projected  from  R'  to  R'. 


Fig. 1. Schematic representation of the calculation of the wavering-product. It is shown how neighbouring points of the
reference  point  x,  rearrange  by  the  transition  from  R’  to  R'. 
The distances used to measure the degree of violation of the topological properties are marked.
W. Liebert et al. define the averaged wavering-product as

$$ \langle W(\dim_E, \tau) \rangle := \log\left( \frac{1}{N_{ref}} \sum_{i=1}^{N_{ref}} \left( \prod_{k=1}^{N_{nb}} Q_1(i, k)\, Q_2(i, k) \right)^{1/(2 N_{nb})} \right) , \tag{4} $$

where $N_{nb}$ is the number of neighbouring points and $N_{ref}$ the number of reference points. For chaotic attractors the explicit $\tau$-dependence can be eliminated by normalizing the wavering-product to $\langle W(\dim_E, \tau) \rangle / \tau$. A minimum of this normalized wavering-product as a function of $\tau$ yields the optimum delay time. A sufficiently large embedding dimension is obtained from the convergence of $\langle W(\dim_E, \tau) \rangle / \tau$ to zero. For a detailed discussion of the wavering-product see [6, 10].
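A brute-force sketch of the averaged wavering-product is given below, to make the roles of the two neighbour orderings concrete. It rests on several assumptions that are not taken from the text: Euclidean distances, all embedded points used as reference points, a default of four neighbours, an integer delay in samples, and the hypothetical function names `delay_embed` and `wavering_product`. It also assumes that no two embedded points coincide, since coincident points would give zero distances in the ratios.

```python
import numpy as np

def delay_embed(xi, dim_e, lag):
    """Delay vectors (xi[i], xi[i + lag], ...) as rows of a matrix."""
    xi = np.asarray(xi, dtype=float)
    n_vec = len(xi) - lag * (dim_e - 1)
    return np.column_stack([xi[j * lag: j * lag + n_vec] for j in range(dim_e)])

def wavering_product(xi, dim_e, lag, n_nb=4):
    """Log of the mean over reference points of (prod_k Q1*Q2)^(1/(2*n_nb))."""
    high = delay_embed(xi, dim_e + 1, lag)   # points in dimension dim_e + 1
    low = high[:, :dim_e]                    # the same points in dimension dim_e
    d_low = np.linalg.norm(low[:, None, :] - low[None, :, :], axis=2)
    d_high = np.linalg.norm(high[:, None, :] - high[None, :, :], axis=2)
    np.fill_diagonal(d_low, np.inf)          # a point is not its own neighbour
    np.fill_diagonal(d_high, np.inf)
    nb_low = np.argsort(d_low, axis=1)[:, :n_nb]    # j(k, dim_e)
    nb_high = np.argsort(d_high, axis=1)[:, :n_nb]  # j(k, dim_e + 1)
    rows = np.arange(len(high))[:, None]
    q1 = d_high[rows, nb_low] / d_high[rows, nb_high]  # distances in dim_e + 1
    q2 = d_low[rows, nb_low] / d_low[rows, nb_high]    # distances in dim_e
    w_i = np.prod(q1 * q2, axis=1) ** (1.0 / (2 * n_nb))
    return float(np.log(np.mean(w_i)))

# A linear ramp embeds onto a straight line in every dimension, so the
# neighbour ordering is unchanged and the wavering-product vanishes.
w_ramp = wavering_product(np.arange(60.0), dim_e=2, lag=1)
# A random series has no preserved neighbourhood structure; only finiteness is checked.
w_noise = wavering_product(np.random.default_rng(1).standard_normal(200), dim_e=2, lag=1)
```

The pairwise distance matrices make this sketch quadratic in the number of points, so it is suited to short series only.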

2.2.  Fill- factor 

The fill-factor method, which uses purely geometrical considerations, guarantees a maximum distance of trajectories. Fig. 2 sketches the basic idea of the fill-factor algorithm using an attractor measured from Taylor-Couette flow. In fig. 2a the trajectories in this two-dimensional reconstruction are maximally separated and the phase space is optimally utilized by data points, while in fig. 2b the attractor tends to collapse. To give a measure of the utilization of a $\dim_E$-dimensional phase space one defines the volume of a parallelepiped spanned by $\dim_E + 1$ points $r_j$ chosen arbitrarily, where $j = 0, \ldots, \dim_E$.

Let $r_0$ be a reference point and calculate the displacement vectors $d_p = r_p - r_0$, where $p = 1, \ldots, \dim_E$. For a proper reconstruction the mean value of $V_{\dim_E, i}(\tau) = |\det(d_1, \ldots, d_{\dim_E})|$ is larger than for an insufficient reconstruction if averaged over a number of $(\dim_E + 1)$-tuples $(r_0, \ldots, r_{\dim_E})$. Based on this idea we define as a measure the fill-factor:

$$ f_{\dim_E}(\tau) := \log\left( \frac{\frac{1}{N_{ref}} \sum_{i=1}^{N_{ref}} V_{\dim_E, i}(\tau)}{\left( \max_{k \in K}\{\xi(t_k)\} - \min_{k \in K}\{\xi(t_k)\} \right)^{\dim_E}} \right) . \tag{5} $$

$N_{ref}$ is the number of chosen tuples $(r_0, \ldots, r_{\dim_E})$. The first maxima of $f_{\dim_E}(\tau)$ in

Fig. 2. Two-dimensional schematic representation of the fill-factor algorithm. Left: a proper reconstruction: trajectories are maximally separated. Right: an improper reconstruction: the attractor tends to collapse. $r_0$, $r_1$ and $r_2$ are arbitrarily chosen points on the attractor. For the delay time $\tau = \tau_1$ the mean value of the area $|\det(d_1, d_2)|$ is larger than for $\tau = \tau_2$ if averaged over a sufficiently large number of triplets $(r_0, r_1, r_2)$.
the interval $0 < \tau < T_c/2$ provide proper choices of the delay time ($T_c$ is the mean recurrence time of the system, i.e. the reciprocal value of the dominant frequency obtained from the power spectrum). For increasing embedding dimensions one recognizes that the qualitative structure of the fill-factor as a function of delay time is preserved and the curves are spaced equidistantly. This behaviour may be used to estimate the sufficiently large embedding dimension. For a detailed description of this algorithm see [5, 11].
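The geometrical idea can be tried out with a small Monte Carlo sketch. Everything named here is an assumption made for illustration: the function names `delay_embed` and `fill_factor`, the base-10 logarithm, the random choice of tuples with a fixed seed, and the integer delay in samples. The test signal is a sine wave with a period of 100 samples, for which a delay of a quarter period opens the reconstruction into a circle while a delay of one sample collapses it towards the diagonal.

```python
import numpy as np

def delay_embed(xi, dim_e, lag):
    """Delay vectors (xi[i], xi[i + lag], ...) as rows of a matrix."""
    xi = np.asarray(xi, dtype=float)
    n_vec = len(xi) - lag * (dim_e - 1)
    return np.column_stack([xi[j * lag: j * lag + n_vec] for j in range(dim_e)])

def fill_factor(xi, dim_e, lag, n_ref=2000, seed=0):
    """Log of the mean parallelepiped volume spanned by random (dim_e + 1)-tuples,
    normalized by the signal extension raised to the power dim_e."""
    xi = np.asarray(xi, dtype=float)
    rng = np.random.default_rng(seed)
    pts = delay_embed(xi, dim_e, lag)
    idx = rng.integers(0, len(pts), size=(n_ref, dim_e + 1))
    tuples = pts[idx]                            # shape (n_ref, dim_e + 1, dim_e)
    disp = tuples[:, 1:, :] - tuples[:, :1, :]   # displacement vectors d_p = r_p - r_0
    volumes = np.abs(np.linalg.det(disp))        # one volume per tuple
    extension = xi.max() - xi.min()
    return float(np.log10(volumes.mean() / extension ** dim_e))

t = np.arange(2000)
signal = np.sin(2 * np.pi * t / 100)
f_quarter = fill_factor(signal, dim_e=2, lag=25)  # delay = quarter period
f_short = fill_factor(signal, dim_e=2, lag=1)     # delay = one sample
```

Scanning `lag` over a range of delays and looking for the first maximum would mimic the selection rule described in the text.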

2.3.  Integral  local  deformation 

The third method considered here describes the local dynamical behaviour of points on the attractor and gives a measure of the homogeneity of the local flow. For real physical systems it is a reasonable assumption that in the original phase space, points on neighbouring trajectories remain neighbouring for small evolution times $t_{ev}$ and are only separated exponentially according to a positive Lyapunov exponent in the chaotic case. We require that this topological behaviour must be preserved in a proper reconstruction. Fig. 3 sketches the idea of this method. It shows an intersection of trajectories in a two-dimensional embedding. At the intersection the flow is ambiguous, in contradiction to the physical understanding. The evolution of the distance $\mathrm{dist}_{\dim_E, i}(\tau)$ between the reference point and the center of mass of neighbouring points measures the "degree" of noncausal behaviour of the dynamics. For the pictures shown below, we calculate the absolute growth of successive distances with

DIST’„^  ,(t)  :=  dist':4'»(T)  -  dist';;l’^./r) 

(6) 

where  q  =  I, . .  . ,  Q  and  =  4. 

To  measure  the  homogeneity  of  the  local  flow  we 
define  the  averaged  integral  local  deformation 
over  reference  points 


Fig. 3. Two-dimensional schematic representation of measuring homogeneity of the local flow. The measured quantity is
the  growth  rate  of  the  distance  from  the  reference  point  x,  to 
the  center  of  mass  4,  in  time,  that  is  - 

||x,  -  Aj|.  where  steps  =  t^.J  T^.  The  dashed  segments  of  the 
trajectories  contain  the  neighbouring  points  of  x,  at  j  -t-  steps. 

ElDISTl4,(T)  +  DISTL„,(r)l 

f-\  »  _ _ 

2/V,,,r,(max,e^.{  ))  -  min.^^^.j  ^(/J} ) 

One finds proper delay times at minima of $\langle \mathrm{ILD}_{\dim_E}(\tau) \rangle$, and a sufficiently large embedding dimension is estimated from the convergence at optimal delay times when $\dim_E$ is increased.

For strange attractors the coordinates become more and more uncorrelated as $\tau$ is increased so that the local expansion rates of trajectories increase. To a first approximation $\langle \mathrm{ILD}_{\dim_E}(\tau) \rangle$ is proportional to $\tau$. So we normalize
(ILDji^^(T))  with  T/7’3.  For  a  comprehensive 
explanation  of  this  method  see  [11]. 

2.4.  Simple  spreading  of  trajectories 

The last procedure also yields a maximum separation of trajectories. We present a local measure which additionally provides the correlation entropy $K_2$. For each pair $(\tau, \dim_E)$ one counts the neighbouring points of a reference point within a given radius $R$. This is repeated for a sufficiently large number of reference points. With the same set of reference points this procedure is done for $(\tau, \dim_E + 1)$, so we can focus on the ratio $P_{\dim_E}(\tau) := \langle N_{\dim_E + 1}(\tau) \rangle / \langle N_{\dim_E}(\tau) \rangle$, where $\langle N_{\dim_E}(\tau) \rangle$ is the averaged number of neighbouring points. At an optimum delay time, i.e. a maximum distance of trajectories, $P_{\dim_E}(\tau)$ will be small, due to the loss of many neighbouring points at the transition from $\dim_E$ to $\dim_E + 1$. For embedding dimensions larger than the first sufficient one, the phase space cannot be blown up in all directions, i.e. no more information can be obtained by adding another coordinate. So the plot of $P_{\dim_E}(\tau)$ as a function of $\tau$ will lose significant minima that appear for small embedding dimensions and will converge towards an accumulation line. The relevant information about the delay time is contained in $P_{\dim_E}(\tau)$ when $\dim_E$ is too small and $\dim_E + 1$ sufficiently large. Here the loss of neighbouring points is maximal, due to a maximum separation of trajectories, resulting in minima in $P_{\dim_E}(\tau)$. As an example we show in fig. 4 the calculation of the spreading of trajectories for the Duffing attractor [12] ($\ddot{x} + D\dot{x} + x + x^3 = F \cos \omega t$, where $D = 0.2$, $F = 40$ and $\omega = 1$). The sufficiently large embedding dimension is $\dim_E = 4$ and the relevant minimum is indicated by arrow A ($\tau/T_c = 0.07$). If the embedding dimension is too small then a minimum value of $P_{\dim_E}(\tau)$ may give misleading results (arrow B, $\tau/T_c \approx 0.25$).
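The neighbour-counting ratio can be sketched as follows. The function names `delay_embed` and `spreading_ratio`, the random choice of reference points with a fixed seed, the integer delay in samples and the test signal are all assumptions made for illustration; the maximum norm is used for the distances. Because the same points are compared in both dimensions, every neighbour in dimension $\dim_E + 1$ is also a neighbour in dimension $\dim_E$, so the ratio cannot exceed 1.

```python
import numpy as np

def delay_embed(xi, dim_e, lag):
    """Delay vectors (xi[i], xi[i + lag], ...) as rows of a matrix."""
    xi = np.asarray(xi, dtype=float)
    n_vec = len(xi) - lag * (dim_e - 1)
    return np.column_stack([xi[j * lag: j * lag + n_vec] for j in range(dim_e)])

def spreading_ratio(xi, dim_e, lag, radius, n_ref=200, seed=0):
    """Mean neighbour count within `radius` in dimension dim_e + 1 divided by
    the mean neighbour count in dimension dim_e, for the same reference points."""
    high = delay_embed(xi, dim_e + 1, lag)
    low = high[:, :dim_e]
    rng = np.random.default_rng(seed)
    ref = rng.choice(len(high), size=min(n_ref, len(high)), replace=False)

    def mean_count(pts):
        # Maximum-norm distances from each reference point to all points.
        dist = np.max(np.abs(pts[ref][:, None, :] - pts[None, :, :]), axis=2)
        # Subtract 1 so that a reference point does not count itself.
        return np.mean(np.sum(dist < radius, axis=1) - 1)

    return float(mean_count(high) / mean_count(low))

t = np.arange(1000)
signal = np.sin(2 * np.pi * t / 100)
# In one dimension the two branches of the sine overlap; adding a second
# coordinate with a quarter-period delay separates them, so neighbours are lost.
p_1 = spreading_ratio(signal, dim_e=1, lag=25, radius=0.1)
```

Evaluating the ratio over a grid of delays and embedding dimensions would give curves of the kind discussed for the Duffing example; the radius has to be large enough that the lower-dimensional count is not zero.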

We may estimate the correlation entropy or order-two Kolmogorov entropy $K_2$ [13] from the logarithm of $P_{\dim_E}(\tau)$ versus delay time. $K_2$ is defined as

$$K_2 = -\lim_{\dim_E \to \infty}\,\lim_{\tau \to 0}\,\frac{1}{\tau}\,\log\frac{C_{\dim_E+1}(\tau)}{C_{\dim_E}(\tau)}, \qquad (8)$$

where $C_{\dim_E}(\tau)$ is the order-two correlation integral [14].

From the definition of $P_{\dim_E}(\tau)$ we find

$$P_{\dim_E}(\tau) = \frac{\sum_{i=1}^{N_{\mathrm{ref}}}\sum_{j}\sigma\left(R - \|x_i - x_j\|_{\dim_E+1}\right)}{\sum_{i=1}^{N_{\mathrm{ref}}}\sum_{j}\sigma\left(R - \|x_i - x_j\|_{\dim_E}\right)} = \frac{C_{\dim_E+1}(\tau)}{C_{\dim_E}(\tau)}. \qquad (9)$$

$\|\cdot\|_{\dim_E}$ denotes the maximum norm in the $\dim_E$-dimensional embedding, $\sigma$ is the Heaviside function and $N_{\mathrm{ref}}$ is the number of arbitrarily chosen reference points. We estimate $K_2$ from a least squares fit of the slopes of the curves $\log_2(P_{\dim_E}(\tau))$ versus delay time for small $\tau$ and higher embedding dimensions.
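The slope fit mentioned above might look as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, an ordinary straight-line least squares fit is used, and the result carries the inverse units of `taus` (bits per orbit if the delays are given in orbital periods).

```python
import math


def slope(xs, ys):
    """Ordinary least squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def k2_estimate(taus, ratios):
    """Estimate K2 as minus the slope of log2(P) versus delay time.

    `ratios` holds P values measured at the small delays in `taus`.
    """
    return -slope(taus, [math.log2(p) for p in ratios])
```

For synthetic ratios that decay exactly as $2^{-K\tau}$ the function returns $K$; for measured curves the choice of the fitting range in $\tau$ remains a judgment call.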

The estimated entropy is too small compared to the largest Lyapunov exponent ($\lambda_1 = 1.0$ bits/orbit). This may be caused by too small a number of data points and an improper choice of the radius $R$ (32768 data points, $R$ = 3% of the attractor extension). This point needs further investigation.

This method is similar to the one given in [15], where a minimum of the logarithmic correlation integral versus delay time is used for an optimum embedding. With our extension one may identify whether the attractor is a projection from a higher dimension, due to the observation of the transition from $\dim_E$ to $\dim_E + 1$.

We stress the point that for chaotic time series one has to use the smallest delay time where $f_{\dim_E}(\tau)$ ($\langle \mathrm{ILD}_{\dim_E}(\tau)\rangle$, $\langle W(\dim_E, \tau)\rangle$, $\log_2(P_{\dim_E}(\tau))$) has its first maximum (minimum), even when the methods seem to give better values at larger delays. For high embedding dimensions the embedding window $\tau(\dim_E - 1)$ becomes so large that the first and last coordinate may be totally uncorrelated.

Fig. 4. $\log_2(P_{\dim_E}(\tau))$ for the Duffing attractor. The line indicates $K_2$ = 0.78 bits/orbit. The value for the metric entropy is $h$ = 1 bits/orbit. Arrows A and B indicate proper and improper delay times, respectively.


3.  Comparison  of  methods 

The goal of our paper is the development of methods applicable to experimental data. Therefore we tested the methods with experimental data measured in rotational Taylor-Couette flow. In order to present two different types of time series we investigated a ten-vortex state showing quasiperiodic dynamics and a two-vortex state showing chaotic behaviour. The physical quantity measured was the axial velocity component obtained with a laser Doppler velocimeter. For signal processing in a digital computer the analogue signal was converted with a resolution of 10 bits. Details of the experiment and the data processing can be found in [16-18].

3.1.  Quasiperiodic  attractor 

In order to test the quality of our results we apply all four methods to an attractor with a well known structure, a two-torus. The periods were $T_1$ = 1.54 s and $T_2$ = 0.22 s, with an amplitude ratio of 3.6. We took $T_1$ as the normalizing period $T$ in figs. 5a-5d. The sampling time was 0.02 s in this case. The number of reference points was $N_{\mathrm{ref}}$ = 600 and the number of data points was $N_{\mathrm{dat}}$ = 32768.

In figs. 5a-5d the four measures for a proper delay time and embedding dimension are shown as a function of the normalized delay time for various embedding dimensions. To illustrate the significance of the measures we plotted the Poincaré sections of the torus for 9 different delay times in fig. 6. The first two maxima of the fill-factor $f_{\dim_E}(\tau)$ are found at $\tau/T \approx 0.05$ and $\tau/T \approx 0.09$, in accordance with the first two minima of the integral local deformation ($\langle \mathrm{ILD}_{\dim_E}(\tau)\rangle$), the wavering product ($\langle W(\dim_E, \tau)\rangle$) and the spreading ($\log_2(P_{\dim_E}(\tau))$). We show in fig. 6 that one gets a well-spanned torus when using these delays. A comparison between fig. 5 and fig. 6 shows the significance of the relative extrema. We want to stress the point that proper and improper delay times may be close to each other, as can be seen in fig. 6 for $\tau/T \approx 0.05$ and $\tau/T \approx 0.06$, respectively. For the higher frequency a difference in the delay time of $\tau/T = 0.01$ already gives a significant phase shift and can cause a collapse of one coordinate.

Fig. 5. (a) wavering product $\langle W(\dim_E, \tau)\rangle$, (b) fill-factor $f_{\dim_E}(\tau)$, (c) integral local deformation $\langle \mathrm{ILD}_{\dim_E}(\tau)\rangle$ and (d) spreading $\log_2(P_{\dim_E}(\tau))$ for the reconstructed two-torus from the Taylor-Couette experiment. Arrow (A) indicates a proper, arrow (B) indicates an improper delay time. Sufficiently large embedding dimension: $\dim_E = 4$. The entropy $K_2 = 0$ is given by the slope of the line in (d).

Fig. 6. Poincaré sections for various delay times for the two-torus. The panels include $\tau/T$ = 0.06, 0.19, 0.27 and 0.49.

From the slope of the accumulation line of $\log_2(P_{\dim_E}(\tau))$ versus delay time for higher $\dim_E$ we calculate the correlation entropy $K_2$ = 0 bits/orbit, as expected for a quasiperiodic state.

From the convergence of the four measures as a function of embedding dimension one can estimate $\dim_E = 4$ to be sufficiently large.

For two delay times identified as proper (arrow A in fig. 5, $\tau/T \approx 0.09$) and as improper (arrow B in fig. 5, $\tau/T = 0.25$) we calculated the correlation dimension according to Grassberger and Procaccia [14]. The result is given in fig. 7. Fig. 7a shows the local slopes from the double logarithmic plot of the correlation integral versus radius $R$ for the proper delay time. The plot reveals the intrinsic structure of the torus. $D_2$ converges for $\dim_E \geq 4$. For the improper delay time no relevant scaling region for the dimension can be found, as can be seen in fig. 7b.

Fig. 7. Local slopes calculated from the double logarithmic correlation integral vs. radius $R$. The radius is given in percent of the total extension of the attractor. (a) Proper reconstruction ($\tau/T = 0.09$): the local slopes reveal the structure of the torus. (b) Improper reconstruction ($\tau/T = 0.25$): a relevant scaling region for an estimation of the correlation dimension is not found.

3.2.  Chaotic  attractor 

As a second example we examined a two-vortex state showing chaotic dynamics. The number of reference points was $N_{\mathrm{ref}}$ = 600 and the number of data points was $N_{\mathrm{dat}}$ = 16384. The sampling time was set to 0.03 s. Fig. 8 shows the velocity power spectrum and the corresponding autocorrelation function. The dominant frequency was 1.11 Hz, so we took $T$ = 0.9 s in this example.

The fill-factor as well as the integral local deformation yield proper delay times at $\tau/T = 0.17$ (indicated by arrow B) and improper delay times at $\tau/T \approx 0.03$ and $\tau/T \approx 0.2$ (indicated by arrows A and C). The first minimum of $\langle \mathrm{ILD}_{\dim_E}(\tau)\rangle$ ($\tau/T$ from 0.1 to 0.17) is wider than the corresponding maximum of the fill-factor. Obviously the local flow of the strange attractor is smooth with maximum utilization in the phase space. The sufficiently large embedding dimension is found to be $\dim_E \geq 5$. The normalized wavering product $\langle W(\dim_E, \tau)\rangle/\tau$, shown in fig. 9a, yields only a less significant minimum for $\dim_E = 7$. Fig. 9d shows the spreading of trajectories at the transition from $\dim_E$ to $\dim_E + 1$ for embedding dimensions 1 to 6. Although $\log_2(P_{\dim_E}(\tau))$ versus delay time has significant extrema, we must say that this method fails, because one cannot estimate a sufficiently large embedding dimension and the minima of $\log_2(P_{\dim_E}(\tau))$ vary when $\dim_E$ is increased. This may be caused by too small a number of data points, which gives insufficient statistics for higher embedding dimensions. The estimated entropy is $K_2$ = 0.9 bits/orbit.

Fig. 8. (a) power spectrum (frequency axis in Hz) and (b) autocorrelation function for a chaotic time series obtained from the Taylor-Couette experiment.

Fig. 9. (a) normalized wavering product, (b) fill-factor, (c) normalized integral local deformation and (d) spreading $\log_2(P_{\dim_E}(\tau))$ for the reconstructed strange attractor from the Taylor-Couette experiment. Arrow (A) marks a too small delay time ($\tau$ equal to the sampling time). Arrow (B) marks a proper delay time ($\tau/T = 0.17$) and arrow (C) an improper delay time ($\tau/T = 0.2$). The normalized wavering product shows less significant extrema in this region. The slope of $\log_2(P_{\dim_E}(\tau))$ is fitted for embedding dimensions 5 and 6. This rough estimation yields $K_2$ = 0.9 bits/orbit.

For the three delay times indicated by A, B and C, respectively, in fig. 9 we calculated the correlation dimension. The results are shown in fig. 10, where the local slopes are plotted as a function of $\log(R)$. As expected from fig. 9, the calculation using $\tau/T \approx 0.03$ (arrow A) as delay time does not give any scaling region in the dimension plot fig. 10a. For the proper delay time $\tau/T \approx 0.17$ (arrow B) one finds a broad scaling region, yielding a correlation dimension $D_2 \approx 3.1$. The scaling region obtained for the delay time $\tau/T = 0.2$ (arrow C) becomes smaller but gives approximately the same result for $D_2$ as before.

3.3.  Noise  and  computer  time  consumption 

Due to its global character the fill-factor is the measure which shows the best robustness against noise. Furthermore, the computer time consumption for this algorithm is low because it does not need any searching and sorting procedures for neighbouring points. For a typical range of delay times and embedding dimensions we estimate the calculation of the fill-factor to be 3 times faster than the simple spreading algorithm, 10 times faster than the averaged integral local deformation and about 20 times faster than the averaged wavering product. One must take this into account when implementing the algorithms on small laboratory computers. We used linearly sorted data set arrays to search for neighbouring points. One may certainly save computation time by using more efficient searching algorithms.

Th. Buzug, G. Pfister / Calculating optimal embedding parameters

Fig. 10. Local slopes calculated from the double logarithmic correlation integral vs. logarithmic radius $R$. The radius is given in percent of the total extension of the attractor. (a) Improper reconstruction ($\tau/T = 0.03$): the local slopes do not reveal the structure of the strange attractor. No scaling region is obtained. (b) Proper reconstruction ($\tau/T = 0.17$): one finds a broad scaling interval for an estimation of $D_2$ = 3.1. (c) Improper reconstruction ($\tau/T = 0.2$): the scaling interval becomes smaller.


4.  Conclusion 

The success of the estimation of fractal dimensions depends strongly on the quality of the embedding parameters. We showed that even adjacent delay times can yield proper and improper attractor reconstructions.

In this paper we compared four algorithms which provide a proper delay time for the reconstruction of a phase space and simultaneously give an estimate for a sufficiently large embedding dimension. The algorithms were applied to experimental time series and the measures were calculated for several embedding dimensions and a wide range of delay times. By calculating the correlation dimension of two attractors which were reconstructed with proper and improper delay times we could verify a posteriori the common-sense criteria used for these algorithms.

This paper discusses some difficulties in estimating dynamics from time-delay embeddings of experimental data that can be characterized as low-dimensional. A new procedure is described to reduce noise by exploiting the properties of saddle periodic orbits on the reconstructed attractor.


1.  Introduction 

There are several different approaches to noise reduction in time series data whose underlying dynamics can be described as low dimensional. Kostelich and Yorke [1,2] outlined a procedure that uses an approach originally suggested by Eckmann and Ruelle [3] for computing Lyapunov exponents. Farmer and Sidorowich [4] describe another method for use when the dynamics are known. Schreiber and Grassberger [5] describe a simple time-series based approach. Cawley and Hsu [6] have suggested an approach based on projecting trajectories onto planes that locally approximate the manifold containing the attractor*.

Most of these methods involve the estimation of a derivative from the data or in some way require a least squares estimate of the location of some portion of the attractor. This paper describes some of the problems inherent in the estimation of dynamics from data, regardless of the type of model used to approximate the dynamics. These difficulties may arise from the fractal structure of the attractor and errors in all the observations. The problems persist regardless of the amount of available data and affect one's ability to determine an accurate local model of the dynamics, even when an accurate model should be obtainable in principle. These issues are discussed in section 2.

* The reprint collection by Hao [7] contains several articles on the analysis of low dimensional chaotic experimental data.

Many of these problems can be circumvented by using as much dynamical information as possible in the formulation of the statistical relationship between the observations. One attempt to do this involves the use of recurrent orbits to derive an accurate linear model of the dynamics in the vicinity of saddle periodic orbits on the attractor. Section 3 describes the method and its application to two experimental data sets previously analyzed in [2].


2.  Statistical  estimation  of  dynamical 
information 

A standard procedure in the analysis of chaotic experimental data is to reconstruct the attractor using the method of time delays [8]. Let $\{s_i\}_{i=1}^{N}$ be a time series of $N$ scalar observations, sampled at equal intervals. The reconstructed attractor consists of points of the form

$$x_i = (s_i, s_{i+\tau}, \ldots, s_{i+(m-1)\tau}), \qquad (1)$$

where $m$ is the embedding dimension and $\tau$ is the time delay. (The discrete sampling means that we may still treat the observed dynamics as a map, and $x_i$ refers to the point on the reconstructed attractor whose first coordinate is the $i$th observation in the time series.) Takens [8] shows that under suitable hypotheses, this reconstruction is equivalent in some sense to the original attractor if $m$ is large enough. (Sauer et al. [9] consider the embedding problem in greater generality. Theiler [10] and Fraser [13] discuss strategies for choosing $m$ and $\tau$.)
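The construction in eq. (1) can be written out directly. The following is a small sketch, assuming the delay is given as an integer number of sampling intervals and using 0-based indices (so the first vector corresponds to the first observation); the function name is chosen here for illustration.

```python
def delay_embed(series, m, tau):
    """Return the delay vectors x_i = (s_i, s_{i+tau}, ..., s_{i+(m-1)tau}).

    `series` is a sequence of scalar observations, `m` the embedding
    dimension and `tau` the delay in sampling intervals.
    """
    if m < 1 or tau < 1:
        raise ValueError("m and tau must be positive integers")
    # Only indices i for which the last coordinate still exists are used.
    n_vectors = max(len(series) - (m - 1) * tau, 0)
    return [tuple(series[i + k * tau] for k in range(m)) for i in range(n_vectors)]
```

A series of $N$ observations yields $N - (m-1)\tau$ reconstructed points, so long delays and high embedding dimensions reduce the number of usable vectors.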

The simplest model for estimating the dynamics is a local linear one suggested by Eckmann and Ruelle [3]. Let $x_{\mathrm{ref}}$ be a reference point on the reconstructed attractor, and let $\{x_i\}_{i=1}^{n}$ be a collection of points in a small neighborhood containing $x_{\mathrm{ref}}$. Let $\{y_i\}$ be the corresponding images; i.e., $y_i$ is the point to which $x_i$ maps at some later time. The "true" dynamics are $y_i = f(x_i)$ for some unknown function $f$. In the Eckmann-Ruelle approach, $f$ is approximated as

$$f(x) \approx Ax + b, \qquad (2)$$

where $A$ is an $m \times m$ matrix and $b$ is an $m$-vector. Here $A$ is an estimate of the Jacobian matrix $Df$ evaluated at $x_{\mathrm{ref}}$. This is an accurate model of the dynamics if all the observations and their images are contained in sufficiently small neighborhoods of the reference point and its image.

Estimates of $A$ and $b$ can be found by ordinary linear regression. Although the basic approach is straightforward, there are many factors that affect one's ability to estimate the map $A$ accurately.
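As an illustration of the regression step, the sketch below fits $A$ and $b$ in eq. (2) to a set of points and their images. It is a minimal example under stated assumptions: the function names are hypothetical, and the normal equations are solved by Gaussian elimination purely for brevity. An orthogonal factorization is numerically preferable when the problem is close to singular.

```python
def solve(mat, rhs):
    """Solve mat * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(mat)
    aug = [row[:] + [r] for row, r in zip(mat, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def fit_local_linear(xs, ys):
    """Least squares estimate of A (m x m) and b (m) with y_i ~ A x_i + b."""
    m = len(xs[0])
    rows = [list(x) + [1.0] for x in xs]      # design rows [x_i, 1]
    gram = [[sum(r[a] * r[c] for r in rows) for c in range(m + 1)]
            for a in range(m + 1)]
    A, b = [], []
    for out in range(m):                      # one regression per output coordinate
        rhs = [sum(r[a] * y[out] for r, y in zip(rows, ys)) for a in range(m + 1)]
        coef = solve(gram, rhs)
        A.append(coef[:m])
        b.append(coef[m])
    return A, b
```

In practice `xs` would be the neighbours of a reference point and `ys` their images one step later.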

Some limiting factors are the result of the dynamics. For instance, the dimension $d$ of the attractor cannot be too large, since the number of data points needed to occupy a ball of a fixed size $\epsilon$ around a typical point on the attractor is proportional to $1/\epsilon^d$ [14]. The ball size $\epsilon$ cannot be too large, because nonlinearities tend to grow with $\epsilon$, making eq. (2) a less accurate approximation of the dynamics at the point $x$. In most cases, the maximum value of $\epsilon$ that yields a good linear approximation depends on the reference point.

These issues are important, but they will not be considered further in this paper. Instead, we are interested in problems that limit one's ability to determine the dynamics from experimental data, even in cases where local linear models like eq. (2) in principle should be good approximations. There are many statistical difficulties arising from the presence of noise and the fractal character of the attractor that deserve more careful attention than they have received in the dynamics literature.

2.1. Ill conditioned least squares models

Most low dimensional chaotic attractors have a fractal, striated structure: points tend to form a Cantor set of layers. The layers can be difficult to distinguish because of the limited resolution and size of typical data sets; instead they resemble a curve. Figure 1 shows a portion of the attractor reconstructed from an x-coordinate time series generated by the Henon map [15]

$$x_{n+1} = 1 - a x_n^2 + y_n, \qquad y_{n+1} = \beta x_n \qquad (3)$$

with the usual parameters $a = 1.4$, $\beta = 0.3$.

If we use eq. (2) to approximate the dynamics in this region of the plane, then we must find the best $2 \times 2$ matrix that fits the data. However, all the observations lie nearly along a straight line. In this case, we should change coordinates so that one axis is parallel to the line containing the observations. The map stretches points along this line by an amount that can be determined readily from the images of the observations. The map contracts points in the orthogonal direction, but this cannot be quantified readily from these data. It becomes more difficult to estimate the rate of contraction of nearby layers of points as the noise level increases, because noise obscures the structure at small scales. The Lorenz [16] and Rossler [17] attractors are two other familiar examples of attractors where similar problems occur: the points contained in a small box around a typical trajectory tend to be coplanar.

Fig. 1. A typical portion of the Henon attractor where the points are nearly collinear.

This kind of attractor structure leads to ill conditioned least squares problems; i.e., the covariance matrix of the observations is nearly singular. The concept is important because the numerically computed solution of an ill conditioned least squares problem has a large relative error. In this case, ill conditioning means that the Jacobian matrix in eq. (2) cannot be estimated accurately.

The singular value decomposition can be used to detect situations like these. Let $\{x_i\}_{i=1}^{n}$ be the collection of points plotted in fig. 1, let $\bar{x}$ be their mean, and let $X$ be the $n \times m$ matrix whose $i$th row is $x_i - \bar{x}$. (We assume $n > m$, where $m = 2$ is the embedding dimension for the Henon data.) Then there exist an $n \times m$ matrix $U$, an $m \times m$ diagonal matrix $\Sigma$, and an $m \times m$ matrix $V$ such that

$$X = U \Sigma V^{T}. \qquad (4)$$

The columns of $U$ and $V$ form an orthonormal basis for the columns and rows of $X$, respectively [18]. Equation (4) is the singular value decomposition of $X$ and can be arranged so that the diagonal elements $\sigma_i$ of $\Sigma$ satisfy $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_m \geq 0$. These are the singular values of $X$ and correspond to the nonnegative square roots of the eigenvalues of the covariance matrix $X^{T}X$.

The singular value decomposition allows one to determine whether the least-squares problem in eq. (2) is singular: if the rank of $X$ is $r$, then only the first $r$ singular values are positive. There is good numerical software for computing the singular value decomposition of a matrix [18]. However, the numerically computed singular values are rarely zero due to roundoff error (even for a singular matrix), but they are small. The condition number $\kappa$ is defined as the ratio $\sigma_1/\sigma_m$. Large values of $\kappa$ correspond to ill conditioned least-squares problems in that upper bounds for the relative error of the solutions are proportional to $\kappa$ and sometimes to $\kappa^2$. (The relevant theorems are technical and will not be stated here; consult [19] or [18] for more details.)

A difficult question is how to decide whether a problem is ill conditioned when the data in $X$ are known only to a finite accuracy. Some criteria are given in [18] and [19]. Numerical experiments by the author on the Henon map and laboratory data sets suggest that least squares solutions to eq. (2) are most accurate if the condition number of the problem is of order 10 or less. If the condition number of the original problem with $m$ coordinates (corresponding to the embedding dimension) is larger than 10, then the problem is solved with fewer coordinates. In other words, if $\sigma_1/\sigma_{k+1} > 10$ but $\sigma_1/\sigma_k < 10$, then the least-squares problem is solved by first projecting the rows of the observation matrix $X$ onto the subspace spanned by the first $k$ columns of $V$. In this case, the matrix $A$ computed in eq. (2) has rank $k$.
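The condition number test can be illustrated for the two dimensional case, where the singular values follow in closed form from the eigenvalues of the symmetric $2 \times 2$ matrix $X^{T}X$. This is a sketch only: the function names are hypothetical, the closed form applies to $m = 2$ alone, and a library singular value decomposition routine would be used for general $m$.

```python
import math


def singular_values_2d(points):
    """Singular values (s1 >= s2) of the centred n x 2 observation matrix."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the symmetric matrix X^T X = [[a, b], [b, c]].
    a = sum((p[0] - mean_x) ** 2 for p in points)
    c = sum((p[1] - mean_y) ** 2 for p in points)
    b = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    mean = (a + c) / 2.0
    diff = math.hypot((a - c) / 2.0, b)
    return math.sqrt(mean + diff), math.sqrt(max(mean - diff, 0.0))


def usable_rank(sing, limit=10.0):
    """Largest k such that s1 / sk does not exceed the heuristic limit."""
    k = 0
    for s in sing:
        if s > 0.0 and sing[0] / s <= limit:
            k += 1
        else:
            break
    return k
```

For points that lie almost on a straight line the ratio of the two singular values is large and `usable_rank` returns 1, which corresponds to solving the problem in a single coordinate along the line.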

The choice of 10 as the upper limit for the condition number is heuristic. A suitable value depends on the noise level and the size of the balls containing the observations for the least squares. If the data are noisy and the fractal character of closely spaced trajectories cannot be distinguished, then 10 probably is a good choice. In this case, the points are smeared along a single thick line, and one cannot estimate the rate at which nearby trajectories are contracted together. A small upper bound like 10 for the condition number usually prevents this.

The points in fig. 1 form an ill conditioned set in that the condition number $\kappa = 348$. The span of the first column of $V$ essentially contains all the observations. So we project them onto the corresponding one dimensional subspace, compute an estimate of the expansion in this direction, and change coordinates back to get the matrix $A$ (whose rank is 1) in eq. (2).

Ill conditioned problems become more common as the embedding dimension increases. For example, the author took a Henon map time series of 65 536 values to which 0.1% noise had been added and embedded it in 4 dimensions. The resulting least squares problems were ill conditioned (i.e., the condition number of the 4-dimensional least squares problems was larger than 10) about 2/3 of the time.

The  dynamics  underlying  the  time  series  are 
also  important.  In  some  cases,  like  the  Lorenz 
flow,  initial  conditions  approach  the  attractor 
very  rapidly,  and  most  of  the  observations  tend 
to  lie  along  a  single  sheet.  When  this  happens, 
it  is  usually  possible  to  obtain  accurate  informa¬ 
tion  only  about  the  expansion  rates  of  points  on 
the  attractor. 

Farmer and Sidorowich [4] have advocated a noise reduction procedure that exploits information about the expanding and contracting directions to locate a new, less noisy trajectory close to the observed one. The idea is to see how a small uncertainty grows in the expanding direction upon forward iteration of the map and how the uncertainty grows in the contracting direction on backward iteration. When the underlying dynamical system is known exactly, their procedure can produce a trajectory of unlimited accuracy. (A similar approach can be used to prove shadowing theorems for numerically generated trajectories of the Henon map [20].) However, the method requires information about the contracting as well as the expanding directions at each point on the attractor. The above discussion suggests that it may be difficult to adapt their procedure to the case where the dynamics must be estimated from the data, because accurate information about the contracting directions may be unobtainable.

Fig. 2. A portion of the Henon attractor. Although the points appear to fall along a parabolic curve, the dynamics are well approximated by a linear map.

The singular value decomposition provides a straightforward solution to the problem of ill conditioning, as long as one is interested in a linear model of the dynamics. However, some authors have suggested the use of quadratic and other nonlinear models for local approximations of the dynamics [21,22]. Local nonlinear models complicate the question of the best choice of variables, and the distribution of the observations can be misleading.

Consider the problem of finding the best fit for the collection of points in fig. 2 to their images. Should the linear model of eq. (2) be used with two orthogonal coordinates, or would it be better to use a quadratic model in one variable, i.e., fit the data with a parabola in some suitable coordinates? In fact, these data are generated from the Henon map (3), and eq. (2) provides an excellent fit. Despite the appearance of the attractor, the dynamics locally are well described by a linear map.

2.2.  Outliers  and  influential  points 

Figure 3a shows a set of points selected for a least-squares fit of eq. (2). The data are generated from the Henon map (3). In this example, the computer is instructed to take the 100 points on the reconstructed attractor closest to the reference point $x_{\mathrm{ref}}$. Figure 3b shows the same data after 1% uniformly distributed random noise is added. All the observations lie in a box whose sides are less than 2.4% of the attractor extent in length, so eq. (2) provides a good linear approximation of the dynamics. The condition number of the associated least squares problem is about 5, so it is not ill conditioned. Nevertheless, there is a potentially serious problem in estimating the Jacobian matrix.

In this example, the strips of points will be stretched in one direction and pushed together after one iteration of the Henon map, as illustrated in figs. 3c, d. The problem is that the three rightmost points in figs. 3a, b are relatively far removed from the rest. Here the amount of stretching along the strips is estimated using 100 points, but in the noisy example, the amount of contraction is estimated using only the rightmost 3 points.

These 3 observations are examples of influential points. In other words, the determinant of the $2 \times 2$ matrix computed from the data is particularly sensitive to small changes in the 3 rightmost points in figs. 3a, b, and its least squares estimate from noisy data is unreliable because the sample is so small.

The  reference  point  is  one  of  the  three  points  on  the 
right-hand  side  of  the  plot  in  fig.  3a,  and  distances 
are  measured  with  the  maximum  norm.  Although  the 
distribution  of  these  points  may  look  peculiar,  entirely 
similar  situations  are  encountered  with  the  use  of  other 
norms. 


Influential points may be outliers, arising from relatively large "glitches" in the data. More frequently, they result from the fractal structure of the attractor. Figure 4 is a schematic illustration of a common situation. The reference point is located on one of the frequently visited portions of the attractor, represented by the closely spaced curves. (Noise may obscure the attractor structure at the smallest scales, so to the observational accuracy, the closely spaced curves may look like a single thick one.) Regardless of the details of the nearest neighbor lookup strategy, the point selection algorithm often finds a set of points whose distribution is similar to that enclosed in the circle. Suppose the map stretches points along the curves and squeezes the curves together after one iteration. Although there may be a very good linear approximation of the dynamics near the reference point, the rate of contraction is estimated using only a few points chosen from the upper (thin) curve.

The notion of an influential point is a heuristic
one, and there is no formal statistical definition.
Unlike ill conditioning, influential points
do not necessarily affect the accuracy of least
squares solutions. For example, the data in
figs. 3a, c are known to the numerical precision
of the computer. Here the least-squares estimate
of the determinant is accurate, even though the
three rightmost points are influential. The estimate
is less accurate in the noisy case, and the
influential points give it a large variance.

For this reason, it is a good idea to check for
influential points, particularly when dealing with
noisy input data. (See [23] or [24] for extensive
discussions of this topic.) Let $X$ be the $n \times m$
matrix of observations as described in section 2.1.
The corresponding prediction matrix is defined
as

$$P = X (X^{T} X)^{-1} X^{T}. \tag{5}$$

Fig. 3. (a) A collection of 100 points from a numerically generated Hénon attractor. (b) The same points after 1% uniformly
distributed random noise has been added. (c), (d) The corresponding images of the points.

Fig. 4. Schematic illustration of how influential points can
arise in least squares problems. The solid curves represent
portions of the attractor. The point selection algorithm picks
observations from within the circle. The points on the upper
curve are influential points.

It can be shown [24,25] that the $i$th diagonal
element $p_{ii}$ of $P$ satisfies $0 \le p_{ii} \le 1$ and that
$1/p_{ii}$ can be regarded as the number of points in
the fit that determine the corresponding observation
$y_i$. Thus the rows in $X$ for which the corresponding
value of $p_{ii}$ exceeds some value, say
$p_{ii} > 0.2$, may be influential points [24,26].
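
As a rough illustration of eq. (5), the sketch below computes the diagonal entries $p_{ii} = x_i^{T}(X^{T}X)^{-1}x_i$ for two-column data and flags rows above a threshold. The function name `leverages_2d`, the synthetic strip-plus-two-points data and the 0.2 cutoff are illustrative assumptions; the rows are taken to be displacements from the reference point, and this is not the author's code.

```python
def leverages_2d(rows):
    """Diagonal of P = X (X^T X)^{-1} X^T for an n x 2 matrix X.

    rows: list of (u, v) pairs, assumed to be displacements from the
    reference point (an assumption of this sketch).
    """
    # Entries of the symmetric 2 x 2 matrix X^T X.
    a = sum(u * u for u, v in rows)
    b = sum(u * v for u, v in rows)
    d = sum(v * v for u, v in rows)
    det = a * d - b * b
    if det == 0.0:
        raise ValueError("X^T X is singular; X has rank < 2")
    # p_ii = x_i^T (X^T X)^{-1} x_i, using the closed-form 2 x 2 inverse.
    return [(d * u * u - 2.0 * b * u * v + a * v * v) / det for u, v in rows]


# 21 points on a densely sampled strip, plus 2 points on a nearby curve.
strip = [(k / 10.0 - 1.0, 0.0) for k in range(21)]
sparse = [(0.0, 0.1), (0.5, 0.1)]
p = leverages_2d(strip + sparse)

threshold = 0.2  # hypothetical cutoff, as in the text's example
influential = [i for i, p_ii in enumerate(p) if p_ii > threshold]
print(influential)        # indices of the two off-strip points
print(round(sum(p), 6))   # trace of P equals rank(X) = 2
```

In this toy data set the contraction direction is determined by only two rows, so those two rows carry about half of the total leverage between them, which mirrors the situation sketched in fig. 4.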

However, it is important to note that not all
points with a large value of $p_{ii}$ have a disproportionate
effect on the least-squares solution, and
conversely. There is no single, clear-cut answer
to the question of what should be done with influential
points in the context of computing local
approximations of dynamical systems. In the
situation illustrated in fig. 3, the three influential
points might be discarded. Although information
about the rate of contraction is lost, it
may not be readily determinable anyway, particularly
from noisy data. (The resulting Jacobian
matrix would have rank 1, as described in section 2.1.)

In many statistical applications, it is unwise
simply to discard influential points. Graphical
analysis of the data and the residuals is generally
recommended. However, this is impractical in
applications like Lyapunov exponent estimation
and noise reduction in chaotic data, where thousands
of multivariate least-squares problems are
solved as the attractor is traversed.

In the case of chaotic data, however, one can
make reasonable assumptions about the process
that generates the observations: points tend to
expand in one direction and contract in others.
The discarded influential points usually involve
only a loss of information about the local contraction
rate.

Nevertheless, it is important to assess how
much the influential points affect the analysis
of a given data set, such as the estimation of
the Lyapunov exponents. One procedure is to
calculate the Lyapunov exponents first without
discarding any influential points. Then the
calculation is repeated, perhaps by discarding
influential points for which the corresponding
value of $p_{ii}$ is particularly large, say $p_{ii} > 0.5$.
Next, some smaller threshold values of $p_{ii}$ can be
tried (the smaller the threshold, the more points
are deemed to be influential). The negative
Lyapunov exponents are the most likely
to be affected by the different rejection criteria.
This procedure may yield a useful estimate of
the accuracy with which the negative Lyapunov
exponents have been calculated.

Influential points are common. For example,
in the numerical experiment on the Hénon map
described at the end of section 2.1, about 2/3
of the least squares problems had at least one
influential point. In this experiment, a point was
classified as influential if $p_{ii} > 0.25$.

It should be noted that the prediction matrix $P$
in eq. (5) can be computed inexpensively from
the QR decomposition of $X$. See [18] for details.
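
For reference, the identity behind this remark can be written out as follows, assuming $X$ has full column rank and $X = QR$ is its thin QR factorization ($Q$ is $n \times m$ with orthonormal columns, $R$ is $m \times m$, upper triangular and invertible). The symbols $Q$, $R$ and $q_{ij}$ are introduced here for illustration only.

$$P = QR\,(R^{T}Q^{T}QR)^{-1}R^{T}Q^{T} = QR\,R^{-1}R^{-T}R^{T}Q^{T} = QQ^{T}, \qquad p_{ii} = \sum_{j=1}^{m} q_{ij}^{2}.$$

Each $p_{ii}$ is therefore the squared norm of the $i$th row of $Q$, and the full $n \times n$ matrix $P$ never needs to be formed.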

The above analysis can be extended to include
the image points $y_i$. Figure 5 is a schematic illustration
of the case where a collection of points
initially are close together on the attractor but
diverge into two or more groups later. This can
happen when some trajectories nearly cross each
other due to a poor time-delay embedding or due
to the complex structure of the attractor. (For
example, this situation can arise on the Lorenz
attractor where trajectories cross from one lobe
to the other.)

Fig. 5. Schematic diagram of diverging trajectories on an
attractor.

One possibility is to form the $n \times 2m$ augmented
matrix $(X : Y)$ [24]. The $i$th row of this
matrix has as its first $m$ entries the vector $x_i - \bar{x}$
and the vector $y_i - \bar{y}$ as the last $m$ entries. (The
bar denotes the mean of the corresponding observations.)
The augmented prediction matrix
$P_{X,Y}$ is the same as in eq. (5), except that the
matrix $(X : Y)$ replaces $X$. A large value for the
$i$th diagonal entry of $P_{X,Y}$ may indicate that the
pair $(x_i, y_i)$ is particularly influential in the fit,
perhaps because it is on a diverging trajectory
as described above. The use of this procedure
may obviate the use of a “global” and a “local”
embedding dimension as discussed for instance
in [21].

Another alternative is to replace least squares
estimation with a different minimizing function,
a topic that will not be considered here. The use
of so-called “robust statistics” offers many interesting
possibilities for the analysis of chaotic experimental
data. The monograph by Tong [27]
considers the application of robust statistics to
certain nonlinear time series models.

2.3. The errors-in-variables problem

Let us consider the problem of fitting a straight
line to a collection of data. The classical least-squares
problem assumes that

$$y_i = a x_i + b + \epsilon_i, \tag{6}$$

where each observation $y_i$ is a linear function
of the independent variable $x_i$, and the random
variables $\epsilon_i$ are normally distributed with mean 0
and variance $\sigma^2$. In eq. (6), it is assumed that
the only error occurs in the measurement of $y_i$;
the values $x_i$ are known exactly.

However, this assumption does not hold when
one deals with noisy input data. Instead, all the
observations are measured with error. Here the
classical least-squares problem is replaced by

$$y_i = a(x_i + \delta_i) + b + \epsilon_i, \tag{7}$$

the so-called errors-in-variables model. Here $\delta_i$
and $\epsilon_i$ are independent, normally distributed
random variables. It can be shown that the classical
least-squares estimate of the slope is biased;
i.e., the slope $a$ is underestimated by an
amount that depends on the variance of the $\delta_i$
and is independent of the number of observations.
Similar results can be obtained in the
multivariate case. (The book by Fuller [28]
contains a detailed discussion of this problem
and an extensive bibliography.)
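
The bias is easy to see in a small simulation. The sketch below uses the common formulation in which the true $x_i$ obeys the line exactly and only $x_i + \delta_i$ is observed (an assumption about how eq. (7) is read); the helper `ols_slope`, the parameter values and the seed are all illustrative. For this setup the ordinary least-squares slope tends to $a\,\sigma_x^2/(\sigma_x^2 + \sigma_\delta^2)$ as the sample grows, so collecting more data does not remove the bias.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


random.seed(0)
a, b = 2.0, 1.0            # true slope and intercept (illustrative)
sigma_delta = 0.5          # std. dev. of the error in x
sigma_eps = 0.1            # std. dev. of the error in y
n = 20000

x_true = [random.uniform(-1.0, 1.0) for _ in range(n)]     # variance 1/3
y_obs = [a * x + b + random.gauss(0.0, sigma_eps) for x in x_true]
x_obs = [x + random.gauss(0.0, sigma_delta) for x in x_true]

slope_clean = ols_slope(x_true, y_obs)   # x known exactly, as in eq. (6)
slope_noisy = ols_slope(x_obs, y_obs)    # x measured with error

var_x = 1.0 / 3.0
attenuation = var_x / (var_x + sigma_delta ** 2)
print(slope_clean, slope_noisy, a * attenuation)
```

The noisy-input slope should land close to the attenuated value rather than the true slope, however large `n` is made.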

Each matrix in eq. (2) computed from noisy
data is inherently inaccurate as long as ordinary
least squares is used to estimate it. This may
be important, for example, in Lyapunov exponent
calculations. The standard algorithm [29]
for estimating the Lyapunov exponents involves
the construction of an orthonormal set of vectors
(the Lyapunov basis). At each iteration, the
vectors are multiplied by the tangent map (here
the matrix $A$, estimated using a ball of points
around the current trajectory point), and the resulting
collection of vectors is orthonormalized
again. The normalization coefficients are then
saved to get an estimate of the Lyapunov exponents.
(See [30,31,3] for details.)
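
A minimal sketch of this re-orthonormalization loop is given below for the Hénon map, using the analytically known tangent map rather than matrices fitted from data. The map is written in the commonly used form $x' = 1 - a x^2 + y$, $y' = b x$ (assumed here; eq. (3) may be written differently), Gram-Schmidt stands in for a general QR step, and the function name and iteration counts are illustrative choices.

```python
import math


def henon_lyapunov(a=1.4, b=0.3, n_iter=50000, n_transient=1000):
    """Estimate both Lyapunov exponents (bits/iteration) of the Henon map."""
    x, y = 0.0, 0.0
    for _ in range(n_transient):          # settle onto the attractor
        x, y = 1.0 - a * x * x + y, b * x

    e1 = (1.0, 0.0)                       # orthonormal Lyapunov basis
    e2 = (0.0, 1.0)
    sum1 = sum2 = 0.0
    for _ in range(n_iter):
        # Tangent map at (x, y): J = [[-2 a x, 1], [b, 0]].
        j11 = -2.0 * a * x
        u1 = (j11 * e1[0] + e1[1], b * e1[0])
        u2 = (j11 * e2[0] + e2[1], b * e2[0])

        # Gram-Schmidt: normalize u1, remove its component from u2.
        r11 = math.hypot(u1[0], u1[1])
        e1 = (u1[0] / r11, u1[1] / r11)
        r12 = e1[0] * u2[0] + e1[1] * u2[1]
        w = (u2[0] - r12 * e1[0], u2[1] - r12 * e1[1])
        r22 = math.hypot(w[0], w[1])
        e2 = (w[0] / r22, w[1] / r22)

        # The normalization coefficients accumulate the exponents.
        sum1 += math.log2(r11)
        sum2 += math.log2(r22)
        x, y = 1.0 - a * x * x + y, b * x

    return sum1 / n_iter, sum2 / n_iter


lam1, lam2 = henon_lyapunov()
print(lam1, lam2)
```

With these settings the largest exponent should come out near 0.6 bits/iteration, and the two exponents sum to $\log_2 b$ because $|\det J| = b$ at every point. When the tangent map is replaced by a matrix fitted from noisy data, any bias in that matrix feeds directly into the accumulated sums.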

This statistical bias may create systematic errors
in the estimation of the Lyapunov exponents.
In the one dimensional case, for example,
the Lyapunov exponent of a chaotic time series
probably would be underestimated because the
slope of the regression line is biased toward zero.
A similar effect may hold in higher dimensions.
As a numerical experiment, the author generated
a time series of 32 768 $x$-coordinates from
the Hénon map (3) with the usual parameters
$a = 1.4$, $b = 0.3$. The largest Lyapunov exponent
$\lambda_1$ obtained from the numerically generated
time series, using a procedure similar to that described
in [30], is 0.608 bits/iteration. (This is
very close to the value $\lambda_1 = 0.607$ bits/iteration
obtained by direct iteration of the analytically
determined tangent map.) The procedure was
repeated after adding 1% uniformly distributed
random noise to the original data. The estimated
value of $\lambda_1$ for the noisy data is 0.562
bits/iteration, an underestimate corresponding
to a relative error of 7%.

Noise introduces a tradeoff between a small
ball size (needed for an accurate linear approximation
of the dynamics) and a large variance
in the data relative to the size of the ball. The
maximum size of the neighborhood that allows
a good local linearization depends on the particular
dynamical system, but typically it is not
larger than a few percent of the attractor extent.
In this case, a noise level of 1% may correspond
to a large relative variance, which can obscure
the dynamics.

The above numerical experiment to estimate
the largest Lyapunov exponent was done with a
data set that is easy to analyze: there are 32 000
data points; the attractor is low dimensional
(the pointwise dimension [14] of the Hénon
attractor is about 1.25); the noise level is relatively
small and uniform; and it is possible
to obtain a good linearization in a fairly large
neighborhood around each point (5% of the attractor
extent or so). Nevertheless, noise limits
the accuracy of the Lyapunov exponent calculation.
In contrast, laboratory data are more difficult
to handle, because they typically generate
higher-dimensional attractors, the time series
are often relatively short, they may contain occasional
large “glitches,” and the noise levels
may be higher. Even with very large data sets,
noise limits one’s ability to extract dynamical
information like Lyapunov exponents.


3.  Noise  reduction  using  periodic  orbits 

Several procedures for reducing noise have
been suggested in cases where the time series
is generated by a low dimensional dynamical
system [2,4,22,5,6]. All of them share some
common features. Typically a time delay embedding
is used to reconstruct the underlying
attractor [8]. A local model of the dynamics,
similar to eq. (2), is constructed in each of a
set of small neighborhoods covering the attractor*.
The observed trajectories in these neighborhoods
are then adjusted slightly so that they
better satisfy some criterion which depends on
the local approximations to the dynamics.

Although these methods can be effective in reducing
noise, they have drawbacks. For example,
the local approximations to the dynamics
usually are determined by fitting the observed
points to the model with ordinary least squares.
Because all the observations are measured with
error, the ordinary least squares solutions for the
parameters of the model are biased, as described
in section 2.3. Moreover, the adjustment of the
trajectories may involve the composition of long
sequences of the statistically determined maps,
and this may lead to systematic errors.

Recent work has shown that saddle periodic
orbits govern the dynamics on typical low-dimensional
chaotic attractors [32,33]. In many
cases, it is possible to locate saddle orbits on
attractors reconstructed from experimental data
by finding recurrent trajectories. These can be
used to estimate the topological entropy [34]
and other invariants associated with the dynamics
[35,36].

* Cawley and Hsu [6] construct a local projection onto
the attractor instead of a map.

In this section we describe how saddle periodic
orbits can be used to reduce the noise in experimental
data. The idea is to linearize around
the saddle orbit to find a model that is linear in
the parameters like eq. (2). However, the model
describes a functional relationship between each
recurrent orbit and its successor. That is, the
parameters of the map and the adjustments to
the observed orbits are determined simultaneously.
Only one map is calculated for each recurrent
orbit. Moreover, the procedure operates directly
on sections of the time series without using
a time-delay embedding of the attractor. Because
only one map is calculated in order to adjust
all the points on each of the trajectories near
the saddle orbit, the procedure is computationally
efficient: over 10 times faster than previous
methods.

3.1. Description of the algorithm

The procedure examines the dynamics near
saddle periodic orbits. Although saddle orbits
are unstable, nearby initial conditions may be
pushed away relatively slowly (this is often true
of experimental data as discussed below). In
such cases, trajectories near the saddle orbit
loop around it almost periodically for a time.
We say that $x_i$ is an $(m, \epsilon)$ recurrent point if it
returns within $\epsilon$ of itself after $m$ iterations of
the map $f$, i.e., $\|f^{m}(x_i) - x_i\| < \epsilon$.
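
The definition translates directly into a brute-force check on a scalar time series. In the sketch below, each point is a delay vector of `dim` consecutive samples and distances use the maximum norm; the function name, the unit delay and the sine-wave test signal are illustrative assumptions rather than details from the text.

```python
import math


def recurrent_indices(series, m, eps, dim=3):
    """Indices i whose delay vector returns within eps after m steps.

    The delay vector at i is (s[i], ..., s[i + dim - 1]); distances are
    measured with the maximum norm.
    """
    hits = []
    last = len(series) - m - dim + 1      # number of i with a complete image
    for i in range(max(last, 0)):
        dist = max(abs(series[i + m + k] - series[i + k]) for k in range(dim))
        if dist < eps:
            hits.append(i)
    return hits


# A signal with period 30 samples: every point is (30, eps)-recurrent.
signal = [math.sin(2.0 * math.pi * k / 30.0) for k in range(300)]
print(len(recurrent_indices(signal, m=30, eps=1e-6)))   # all 268 candidates
print(recurrent_indices(signal, m=7, eps=0.01))         # none
```

For real data the scan would be repeated over a range of candidate values of `m`, and `eps` would be chosen relative to the attractor extent.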

In a time delay embedding, the presence of an
$(m, \epsilon)$ recurrent point means that there may be
consecutive sections of $m$ time series values that
are nearly the same. Figure 6 shows a portion of
a time series record from an oscillating chemical
experiment analyzed in [34]. The reconstructed
attractor has many points that recur after $m =
375$ time steps. The marked sections in fig. 6
show some of the respective pieces of the time
series. These sections appear to correspond to
trajectories that loop around a saddle periodic
orbit on the attractor. Here $x_1$ marks the first
section near the saddle orbit, and $x_2$, $x_3$, etc.
refer to successive sections of $m$ values.

E.J. Kostelich / Problems in estimating dynamics from data

Fig. 6. A portion of a time series from an oscillating chemical
experiment. The notation $x_i$ refers to the $i$th section
of the time series which is nearly periodic.

We will treat each section as a vector of $m$
components. Each successive section is a function
$F$ of the previous one. We write $x_{i+1} =
F(x_i)$, where $F$ represents the dynamics near
the saddle periodic orbit. This is a natural place
to consider the linearized map, as follows.

Let $x_f$ denote the saddle periodic orbit. That
is, $x_f$ represents the periodic sequence of $m$ time
series values that would be generated if the initial
condition were chosen exactly on the saddle
periodic orbit and the sampling interval were exactly
$1/m$ times the period of the saddle orbit. If
$x_i$ and $x_{i+1}$ are sufficiently close to $x_f$, then

$$x_{i+1} - x_f = F(x_i) - x_f \approx A(x_i - x_f), \tag{8}$$

where $A$ is the Jacobian matrix of partial derivatives
of $F$ evaluated at $x_f$ [37]. Although the
map $F$ and its derivatives are unknown, they can
be approximated from the data as described in
section 2, using a model similar to eq. (2).

In the case of experimental data representing
discrete observations from a flow, the recurrence
time $m$ can be large. For the data in fig. 6, we
have $m = 375$. It is impractical to compute an
$m \times m$ approximation of the Jacobian matrix
in this example, because it requires an estimate
of about 140 000 parameters ($375^2 = 140\,625$)
from a small number of recurrent trajectories.

There is considerable redundancy in the observations,
however, and the singular value decomposition
can be exploited to reduce the problem
to the estimation of a $d \times d$ matrix where $d \ll
m$ **. Suppose $\{x_i\}_{i=1}^{p}$ is a collection of successive
$m$-recurrent trajectories around some saddle
orbit. Let $\bar{x}$ denote the mean of these observations.
Let $X$ be the $m \times p$ matrix whose $i$th
column contains $x_i - \bar{x}$, and consider its singular
value decomposition given in eq. (4). The
columns of $U$ form an orthonormal basis for the
columns of $X$. Moreover, the sum of squares
$\sum_i \sigma_i^2$ of the singular values equals the total
variance $\sigma_{\mathrm{tot}}^2$ in the observations $x_i$ [19].

We retain only the first $d$ singular vectors (the
first $d$ columns of $U$), where $d$ is the smallest
number such that

$$\sum_{i=1}^{d} \sigma_i^2 > C \sigma_{\mathrm{tot}}^2, \tag{9}$$

and project each column of the observation matrix
$X$ onto the subspace spanned by them. Here
$C$ is a number between 0 and 1, representing
some fraction of the total variance. (In general,
one can experiment with different values of $C$ to
determine the effects of retaining different numbers
of singular vectors. In the results described
below, we illustrate the analysis with $C = 0.5$
and $C = 0.75$.) In the new coordinates, the $m$-vector
$x_i$ is replaced by a $d$-vector $v_i$, whose entries
correspond to the components of $x_i$ along
each of the $d$ most significant singular vectors.
The mean of the original observations is mapped
to the origin. The saddle periodic point $x_f$ in the
original coordinates is now a $d$-vector $v_f$ in the
new coordinates. We let $v_f = 0$ be the initial
guess.

** Broomhead et al. [12] describe a similar procedure
for estimating the local dynamics in large dimensional
systems.
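
The selection rule in eq. (9) is straightforward to apply once the singular values are known. The sketch below assumes they are already available in decreasing order (from any SVD routine) and takes the total variance to be the sum of all squared singular values; the function name `choose_rank` and the example values are hypothetical.

```python
def choose_rank(singular_values, c):
    """Smallest d whose leading d squared singular values exceed c * total."""
    if not 0.0 < c < 1.0:
        raise ValueError("c must lie strictly between 0 and 1")
    total = sum(s * s for s in singular_values)
    partial = 0.0
    for d, s in enumerate(singular_values, start=1):
        partial += s * s
        if partial > c * total:
            return d
    return len(singular_values)   # only reached through round-off


sigma = [3.0, 2.0, 1.0]           # illustrative values; squares sum to 14
print(choose_rank(sigma, 0.5))    # 9/14 > 0.5
print(choose_rank(sigma, 0.75))   # 13/14 > 0.75
print(choose_rank(sigma, 0.95))   # needs all three
```

Trying several values of `c`, as the text suggests, shows how quickly the retained dimension grows with the fraction of variance kept.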

The dynamics near the saddle orbit are modeled
as in eq. (8), with respect to the basis given
by the first $d$ columns of $U$. The saddle orbit
corresponding to the fixed point is near but not
exactly at the origin in the new coordinates, and
each of the projected observations still contains
some error because of the noise. Although the
covariance matrix for the remaining errors is not
known, we assume it is a multiple of the identity
matrix. This is equivalent to the following three
assumptions: (1) the errors in the vectors are
independent and identically distributed; (2) the
variance of the error in each coordinate of each
point $v_i$ is the same (this is reasonable since all
the observations come from the same time series);
(3) any correlation between the errors in
different coordinates is negligible.

Following Schwetlick and Tiller [38], we determine
values $A$ of the Jacobian, $v_f$ of the fixed
point, and $\hat{v}_i$ of the observations in eq. (8) that
minimize

$$S = \sum_i w \|\hat{v}_i - v_i\|^2 + \sum_j \|\hat{v}_{j+1} - v_f - A(\hat{v}_j - v_f)\|^2. \tag{10}$$

It can be shown that these are unbiased estimates
under mild additional hypotheses [28]. The first
summation in eq. (10) is over all the observations
near the fixed point. (Here $w$ is a weighting
parameter for the original observations. In
the results reported below, we have set $w = 1$.)
The second sum is only between those pairs of
observations for which one point is the image of
the other.
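
To make the two sums concrete, the sketch below evaluates an objective of this form for given trial values. It follows the reading of eq. (10) suggested by eq. (8): a weighted sum of squared adjustments plus the squared residuals of the linear map between successive adjusted sections. The function names and the toy numbers are hypothetical, and the minimization itself is not shown.

```python
def mat_vec(a, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def objective(a, v_fixed, observed, adjusted, w=1.0):
    """Sum of squares: adjustment term plus linear-dynamics residual term.

    observed, adjusted: lists of d-vectors in time order, so that
    adjusted[j + 1] is the image of adjusted[j].
    """
    # First sum: how far each adjusted point moved from its observation.
    s = sum(
        w * sum((p - q) ** 2 for p, q in zip(v_adj, v_obs))
        for v_adj, v_obs in zip(adjusted, observed)
    )
    # Second sum: mismatch between each image and the linear prediction.
    for v_now, v_next in zip(adjusted, adjusted[1:]):
        dev = [p - f for p, f in zip(v_now, v_fixed)]
        pred = mat_vec(a, dev)
        s += sum(
            (n - f - g) ** 2 for n, f, g in zip(v_next, v_fixed, pred)
        )
    return s


# Toy check: points generated exactly by v_next - vf = A (v - vf) give S = 0.
A = [[0.5, 0.0], [0.0, 2.0]]
vf = [0.1, -0.2]
orbit = [[0.3, -0.1]]
for _ in range(5):
    dev = [p - f for p, f in zip(orbit[-1], vf)]
    orbit.append([f + g for f, g in zip(vf, mat_vec(A, dev))])

print(objective(A, vf, orbit, orbit))             # essentially 0
shifted = [[p + 0.01 for p in v] for v in orbit]
print(objective(A, vf, orbit, shifted) > 0.0)     # True
```

With six sections, the first sum has six terms and the second has five, one for each pair of successive sections. A minimizer would treat the entries of `a`, `v_fixed` and `adjusted` as unknowns simultaneously.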

The sum of squares in eq. (10) is similar in
spirit to the minimization attempted in an earlier
paper by Kostelich and Yorke [2]. The difference
here is that the entries of the matrix $A$
are not determined separately from the adjusted
observations $\hat{v}_i$. That is, instead of determining
$A$ from linear least squares first, then adjusting
the observations $v_i$, we treat both the observations
and the entries of $A$ as unknowns in the
same nonlinear minimization problem. As mentioned
above, we hope to get a better (unbiased)
estimate of $A$, corresponding to the dynamics
around the saddle orbit.

Equation (10) involves a relatively small
number of successive sections of the original
time series. If the procedure is applied to the
data in fig. 6, for instance, the first sum in eq. (10)
runs from $i = 1$ to 6 and the second sum from
$j = 1$ to 5.

Schwetlick and Tiller [38] describe a Gauss-Newton
method for minimizing $S$. It is also possible
to proceed more simply and use the Polak-Ribière
conjugate gradient method described
in [39]. A suitable initial guess for $A$ usually can
be obtained by ordinary least squares, and we
set $v_f = 0$ initially. After the points $\hat{v}_i$ are determined,
we change coordinates back to obtain
the adjusted recurrent orbits.

We conclude this introduction with some remarks
about finding recurrent orbits from a
given experimental time series. The basic procedure
is simple. First reconstruct the attractor
using a time delay embedding. For each point
on the attractor, find the neighboring points and
see whether they return after some multiple of
$m$ time steps. (Depending on the sampling time
and the underlying dynamics, reasonable values
for $m$ might range from 1 to 1000 or more.)
One must also verify that successive sections of
the corresponding $m$ values in the original time
series also remain close.

The size $\epsilon$ of the neighborhoods must not
be chosen so large that unrelated trajectories are
considered close to the same saddle orbit, and
$\epsilon$ cannot be so small that recurrent points are
missed. (For instance, noise might knock a recurrent
point out of a small neighborhood of the
saddle orbit.) In the results reported below, different
values of $\epsilon$ have been used to locate the recurrent
orbits, depending on the data set. These
values were chosen by trial and error to avoid
the problems just mentioned.

3.2.  Results  using  experimental  data 

One difficulty that arises in the discussion of
various noise reduction schemes is how to quantify
the amount of noise removed from a chaotic
experimental data set when the “true” dynamics
are unknown. The power spectra of chaotic data
have a broadband component, and it is difficult
to determine how much is attributable to the dynamics
and how much comes from the noise.

For the sake of comparison, we consider the
two Couette-Taylor data sets previously analyzed
in [2]. The first of these is degenerate in
some sense because it comes from wavy Taylor
vortex flow, whose dynamics are equivalent to
a stable limit cycle instead of a saddle orbit.
However, in this case power spectra do give a
reliable indication of the noise level, since the
flow is periodic.

The data set consists of 32 768 time series values
that are measurements of one velocity component
at equally spaced time intervals in wavy
Taylor vortex flow [2]. Nearly all the observations
are recurrent with a recurrence time corresponding
to 30 time steps. (In general, the recurrence
time is not an integer multiple of the
sampling time, which always creates some small
phase errors. In this data set, the observations
can be divided into 5 groups of recurrent orbits,
each around the same limit cycle, whose trajectories
are nearly in phase.)

For each group of recurrent orbits, the singular
value decomposition and the change to singular
basis coordinates are done as described in
section 3.1. The noise reduction is accomplished
by minimizing the sum of squares in eq. (10) in
the subspace spanned by the first $d$ singular vectors.
We illustrate the results obtained when $d = 2$
(accounting for about half the total variance, i.e.,
$C \approx 0.5$ in eq. (9)) and $d = 5$ (corresponding
to $C \approx 0.75$).

Figure 7a shows the power spectrum of the raw
data. The power spectrum of the data adjusted
with respect to only the first 2 singular basis vectors
(fig. 7b) is comparable to that of the data after
processing by the Kostelich-Yorke algorithm
(see fig. 4 in [2]). A careful examination of the
spectra in fig. 7a, b reveals a small peak near 0.3
times the Nyquist frequency in the original data
that is missing from the processed data (this also
occurs in [2]). However, when the first 5 singular
basis vectors are used, the peak remains and
the noise floor rises only slightly, as illustrated in
fig. 7c. This suggests that the flow may be weakly
quasiperiodic. In such cases, the present method
may preserve the dynamics more faithfully than
the procedure described in [2], provided that $d$
is large enough.

The method is computationally efficient, since
only one map needs to be computed for each
set of recurrent trajectories. In this example, the
minimization of eq. (10) corresponds to the simultaneous
adjustment of dozens of trajectories
representing hundreds of time series values. On
a Silicon Graphics Personal Iris workstation, it
takes about 8 seconds to determine all the recurrent
orbits in the data set and about 43 seconds
to minimize the sum of squares in eq. (10) for
each of the 5 groups of recurrent trajectories. All
but 1100 of the time series values are adjusted in
about 50 seconds of computer time for the case
$d = 5$. (In contrast, the noise reduction method
described in [2] requires 15 minutes of CPU
time.)

We next consider the application of the
method to a time series of 32 768 values in a
weakly turbulent Couette-Taylor flow [40].
(This is the same data set described in fig. 5
of [2].) A careful examination of the data set
suggests that almost 31 000 of the values lie on
a single recurrent orbit **.

The singular value decomposition and the
change of coordinates to the corresponding basis
are computed as described in section 3.1. As
above, the noise reduction is done using the first
$d = 2$ and $d = 5$ singular vectors (this captures
about 50% and 75% of the variance respectively
in the recurrent trajectories, as in the periodic
case).

** The recurrence time corresponds to approximately 1153
time steps. The search for recurrent orbits was done by
embedding the attractor in 6 dimensions and setting $\epsilon$
to 8% of the attractor extent.

Fig. 7. (a) Power spectrum from wavy Taylor vortex flow. (b), (c) Power spectrum after noise reduction using the first 2
and the first 5 singular basis vectors. The vertical axis is the base-10 logarithm of the power spectral density; the horizontal
axis is in multiples of the Nyquist frequency.

Fig. 8. Power spectra for the weakly chaotic Taylor-Couette data set described in [2]. (a) Power spectrum for the raw data.
(b), (c) Power spectra using the noise reduction procedure described in the text with the first 2 and first 5 singular basis
vectors, respectively.

Figure 8 shows the resulting power spectra
(the units on the plots are the same as those
in [40]). The corresponding phase portraits
closely resemble those in fig. 5 of [2] and are
omitted. The results are comparable to those obtained
from the Kostelich-Yorke algorithm [2].

The consistency of the statistical relationship
can be checked in a cursory way by plotting the
estimated saddle orbit in the original time series
coordinates. Figure 9 shows the saddle orbit
obtained when the sum of squares is minimized
using the first 5 singular vectors **.

However, in the $d = 5$ case, the power spectral
density associated with higher frequencies in
the processed data is somewhat larger than that
in [2]. Because the data are chaotic, it is difficult
to decide whether this higher “noise floor” corresponds
to a more faithful preservation of the
dynamics, a less effective noise reduction strategy,
or some combination of the two.

** The computer time required is the same as for the periodic
data.

Fig. 9. The saddle orbit estimated from the chaotic Couette-Taylor
data.

The calculation of the singular value decomposition
is subject to outliers and influential points
as described in section 2.2. There are some relatively
large “glitches” in these experimental data
that could affect the resulting singular vector basis,
thus limiting the effectiveness of the noise
reduction procedure. The author tried iterating
the process (i.e., using the output of one run,
where most of the glitches are gone, as the input
of the next), but the resulting power spectra
are very similar.

Finally, the author checked the effects of different starting values for the matrix $A$ and fixed point $v_f$ in eq. (8). Sometimes the starting guess for $A$ came from ordinary linear regression, and sometimes the initial guess was the identity matrix. Likewise, the fixed point $v_f$ was originally set to zero (corresponding to the mean of the observed trajectories), and sometimes it was chosen slightly away from 0. In all cases, the numerical method converged to the same values of $A$, $v_f$, and adjusted observations $v_j$.


4.  Conclusion  and  remarks 

The  noise  reduction  scheme  described  here  is 
applicable  only  to  recurrent  trajectories  near  a 
periodic  orbit.  Points  that  are  not  near  a  peri¬ 
odic  saddle  orbit  will  not  be  adjusted.  A  noise 
reduction  method  like  that  in  [2]  can  be  used  to 
adjust  such  non-recurrent  points. 

In  some  cases,  where  the  data  set  is  either 
strongly  quasiperiodic  or  highly  chaotic,  few 
points  are  recurrent,  and  the  method  described 
here is not useful. For example, the distance between two typical nearby points on the Hénon attractor [15] approximately doubles after each iteration of the map. For this reason, few recurrent trajectories can be found in a moderately sized data set.

However,  recurrent  trajectories  are  common 
in  systems  that  mimic  many  kinds  of  experi¬ 
mental  data.  For  instance,  the  author  has  found 
that  slightly  more  than  half  the  points  are  recur¬ 
rent  in  a  time  series  of  65  000  x  coordinates  ob¬ 
tained  by  numerical  integration  of  the  Lorenz 
equations [16] with the usual parameters #7.

The  recurrent  trajectories  near  periodic  orbits 
on  an  attractor  are  a  natural  place  to  find  an 
accurate  linear  approximation  of  the  dynam¬ 
ics.  The  time  series  values  comprising  the  tra¬ 
jectories  can  be  treated  as  a  sequence  of  long 
vectors.  The  singular  value  decomposition  can 
be  exploited  to  project  them  onto  a  low  di¬ 
mensional  subspace  where  a  least-squares  min¬ 
imization  method  can  be  used  to  determine  an 
approximate  Jacobian  matrix  and  a  more  self- 
consistent  set  of  slightly  adjusted  observations. 
The  adjusted  time  series  is  obtained  by  a  simple 
change  of  coordinates.  In  contrast  to  ordinary 
least  squares,  the  procedure  uses  the  functional 
relationship  between  the  recurrent  trajectories 
in  the  least  squares  minimization.  There  is  no 
artificial  distinction  between  dependent  and 


#7 The attractor is embedded in 4 dimensions and $\epsilon$ is set to 3% of the attractor extent in the search for recurrent points.
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independent  variables,  which  may  avoid  the 
bias  resulting  from  errors  in  measurement.  The 
method  is  computationally  efficient,  since  only 
one  map  needs  to  be  determined  for  each  set  of 
recurrent  orbits. 
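
The projection step summarized above can be sketched as follows. This is only an illustration under stated assumptions: the array layout (one recurrent trajectory per row), the function name `project_trajectories`, and the synthetic two-mode test signal are all hypothetical, and the least-squares estimation of the Jacobian is not included.

```python
import numpy as np

def project_trajectories(trajs, d):
    """Project trajectory vectors onto their first d right singular vectors.

    trajs: array of shape (n_traj, length); each row is one recurrent
    trajectory written as a single long vector (illustrative layout).
    Returns the rank-d reconstruction and the fraction of variance captured.
    """
    M = np.asarray(trajs, dtype=float)
    mean = M.mean(axis=0)                 # origin = mean trajectory
    centered = M - mean
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:d]                        # first d singular basis vectors
    coords = centered @ basis.T           # low-dimensional coordinates
    recon = mean + coords @ basis         # change of coordinates back
    captured = float(np.sum(s[:d] ** 2) / np.sum(s ** 2))
    return recon, captured

# Synthetic demo: 20 noisy copies of a two-mode signal.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 50)
coeffs = rng.standard_normal((20, 2))
clean = coeffs[:, :1] * np.sin(t) + coeffs[:, 1:] * np.cos(t)
noisy = clean + 1e-3 * rng.standard_normal(clean.shape)
recon, captured = project_trajectories(noisy, d=2)
```

The returned `captured` value plays the role of the variance fractions quoted for d = 2 and d = 5; for real data it would of course be smaller than in this nearly noise-free toy case.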
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The  Ott-Grebogi-Yorke  control  method  is  analyzed  in  the  case  that  the  attractor  is  reconstructed  from  a  time  series 
using  time-delay  coordinates.  It  turns  out  that  the  control  formula  of  OGY  should  be  modified  in  order  to  apply  to 
experimental  systems  if  time-delay  coordinates  are  used.  We  reveal  that  the  experimental  surface  of  section  map  does  not 
only  depend  on  the  actual  parameter  but  also  on  the  preceding  one.  In  order  to  meet  this  dependence  two  modifications 
are  introduced  which  lead  to  a  better  performance  of  the  control.  To  compare  their  control  abilities  they  are  applied  to 
simulations  of  a  Duffing  oscillator.  The  issue  of  measurement  noise  is  considered. 


1.  Introduction 

In  1990,  Ott,  Grebogi  and  Yorke  (OGY)  [1] 
proposed  a  new  method  of  controlling  a  chaotic 
dynamical  system  by  stabilizing  one  of  the  many 
unstable  periodic  orbits  embedded  in  a  chaotic 
attractor,  using  only  small  time-dependent  per¬ 
turbations  in  some  accessible  system  parameter. 

This  method  has  attracted  the  attention  of 
many  physicists  interested  in  applications  of  non¬ 
linear  dynamics.  One  reason  for  this  is  that  OGY 
stress  that  all  values  needed  to  achieve  control 
can  be  obtained  from  an  experimental  signal 
starting  with  the  well-known  embedding  tech¬ 
nique  [2,  3].  Therefore,  the  control  method  can 
in  principle  be  applied  to  experimental  systems 
where  the  dynamical  equations  are  not  known. 
Indeed, Ditto et al. demonstrated recently [4]
a  first  control  of  a  physical  system  using  the 
method  of  Ott,  Grebogi  and  Yorke. 

With  regard  to  possible  applications  we  inves¬ 
tigate  the  performance  of  the  control  method  in 
the  case  that  the  chaotic  attractor  is  recon¬ 


structed  by  the  embedding  technique.  This  is 
done  by  simulating  an  experimental  situation. 
The  damped  and  driven  Duffing  oscillator  is 
numerically  integrated  and  the  displacements  are 
taken as “experimental” time series $z(t)$. Using
delay  coordinates,  the  attractor  is  reconstructed 
and  the  unstable  periodic  point  which  one  wants 
to  stabilize  is  determined  in  an  experimental 
surface of section. It turns out that the control method of OGY should be modified in order to apply to experimental systems if time-delay coordinates are used. The main argument will be that during the control process one switches the control parameter $p$ from $p_{i-1}$ to $p_i$ at times $t_i$ ($t_i$: time of the $i$th piercing of the surface of section by the trajectory). But, if one uses delay coordinates, the experimental surface of section map $P$ will depend not only on the new actual parameter $p_i$ (as OGY implicitly assume) but also on the preceding one, $p_{i-1}$. Our modifications of the algorithm take these dependencies into account.

The  paper  is  organized  as  follows.  In  section  2 
we  briefly  recall  the  OGY  method  and  introduce 
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our  notation.  In  section  3  the  problems  arising 
for  the  control  method  using  delay  coordinates 
are  studied  and  the  modifications  mentioned  are 
proposed. In section 4 the performance of the
different  versions  of  the  method  is  compared  in 
the  case  of  the  Duffing  oscillator.  With  regard  to 
possible  applications  the  robustness  of  the  con¬ 
trol  method  in  the  presence  of  measurement 
noise  is  briefly  considered  in  section  5. 


2.  The  OGY  control  method 

For  simplicity  we  restrict  ourselves  to  a  two- 
dimensional  discrete  dynamical  system.  This  re¬ 
striction  is  mainly  for  the  ease  of  presentation. 
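Extensions of the method to higher-dimensional dynamical systems also exist [5,6]. Let the system depend on some accessible parameter $p \in (p_0 - \delta p_{\max},\, p_0 + \delta p_{\max})$ with maximal possible perturbation $\delta p_{\max}$,

$$\xi_{i+1} = P(\xi_i, p), \qquad \xi_i \in \mathbb{R}^2, \quad p \in (p_0 - \delta p_{\max},\, p_0 + \delta p_{\max}). \tag{1}$$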

We assume that for $p = p_0$ the system is on a chaotic attractor. Let $\xi_F = P(\xi_F, p_0)$ denote the unstable fixed point on the attractor which one wants to stabilize. The control idea is to monitor the system until it comes close to the desired fixed point and then change $p$ by a small amount such that the next state $\xi_{i+1}$ will fall onto the stable direction of the fixed point. To do this, one uses the first-order approximation of (1) near $\xi_F$ and $p_0$,

$$\xi_{i+1} = P(\xi_i, p_i) \approx \xi_F + A(\xi_i - \xi_F) + w\,(p_i - p_0) \tag{2}$$

or

$$\delta\xi_{i+1} \approx A\,\delta\xi_i + w\,\delta p_i, \tag{3}$$

where $\delta\xi_i = \xi_i - \xi_F$ and $\delta p_i = p_i - p_0$, and $A = D_\xi P(\xi_F, p_0)$ is a $2 \times 2$ matrix and $w = (\partial P/\partial p)(\xi_F, p_0)$ a two-dimensional vector.

As the unstable fixed point $\xi_F$ is embedded in a chaotic attractor, the linearization $A$ possesses an eigenvalue $\lambda_s$ with $|\lambda_s| < 1$ and corresponding eigenvector $e_s$, and an eigenvalue $\lambda_u$ with $|\lambda_u| > 1$ and eigenvector $e_u$. The eigenvectors $e_s$ and $e_u$ give the directions of the local stable and unstable manifold of the fixed point. Let $f_s$ and $f_u$ be the contravariant basis vectors, i.e. $f_s \cdot e_u = f_u \cdot e_s = 0$ and $f_s \cdot e_s = f_u \cdot e_u = 1$. Then $A$ can be written as $A = \lambda_u e_u f_u + \lambda_s e_s f_s$. The condition that $\xi_{i+1}$ falls on the local stable manifold of the fixed point can now be formulated as

$$f_u \cdot \delta\xi_{i+1} = 0. \tag{4}$$

If $\xi_i$ comes close to $\xi_F$ the linearization (3) holds. The control requirement (4) can then be applied, which yields for the new value of the control parameter $p_i = p_0 + \delta p_i$

$$\delta p_i = -\lambda_u\,\frac{f_u \cdot \delta\xi_i}{f_u \cdot w}. \tag{5}$$

The control is only activated if the resulting change in the parameter, $|\delta p_i|$, is less than the maximal allowed disturbance $\delta p_{\max}$; otherwise $\delta p_i$ is set to zero. As usual it is assumed that the denominator occurring in (5) does not vanish. After $\xi_{i+1}$ has fallen on the local stable manifold, the parameter perturbation $\delta p_i$ could be set to zero. But because of errors in the determination of the linearization, measurement errors, or a small amount of noise, the system will in general tend to fall off the stable manifold again. Therefore, the control process (5) has to be activated at every time step $i$ to keep the successive points $\xi_i$ near $\xi_F$.
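
The control rule of this section can be tried on a toy system. The sketch below is an illustration only: the Hénon map is used as a stand-in for the map $P$, the parameter values `B`, `P0`, `DP_MAX` and all function names are assumptions of this example, and the orbit is started close to the fixed point instead of waiting for a chaotic trajectory to approach it.

```python
import numpy as np

B = 0.3          # fixed Henon parameter (assumption of this example)
P0 = 1.4         # nominal value p_0 of the control parameter
DP_MAX = 0.05    # maximal allowed perturbation (illustrative)

def henon(xi, p):
    """Stand-in map P(xi, p): x' = p - x^2 + B*y, y' = x."""
    x, y = xi
    return np.array([p - x * x + B * y, x])

# Unstable fixed point xi_F = P(xi_F, p_0): x solves x^2 + (1 - B) x - p_0 = 0.
x_f = 0.5 * (-(1.0 - B) + np.sqrt((1.0 - B) ** 2 + 4.0 * P0))
XI_F = np.array([x_f, x_f])
A = np.array([[-2.0 * x_f, B], [1.0, 0.0]])   # Jacobian D_xi P at (xi_F, p_0)
W = np.array([1.0, 0.0])                      # dP/dp at (xi_F, p_0)

# Left eigenvector f_u of A for the unstable eigenvalue lambda_u.
eigvals, left_vecs = np.linalg.eig(A.T)
k = int(np.argmax(np.abs(eigvals)))
LAM_U = float(eigvals[k].real)
F_U = left_vecs[:, k].real

def ogy_step(xi):
    """One controlled iteration using dp = -lambda_u (f_u.dxi)/(f_u.w)."""
    dxi = xi - XI_F
    dp = -LAM_U * (F_U @ dxi) / (F_U @ W)
    if abs(dp) >= DP_MAX:      # outside the allowed range: no control
        dp = 0.0
    return henon(xi, P0 + dp), dp

def run_ogy(xi0, n_steps):
    """Iterate the controlled map and return the final distance to xi_F."""
    xi = np.asarray(xi0, dtype=float)
    for _ in range(n_steps):
        xi, _ = ogy_step(xi)
    return float(np.linalg.norm(xi - XI_F))

final_distance = run_ogy(XI_F + np.array([1e-3, -5e-4]), 30)
```

Because the rule contains `F_U` in both numerator and denominator, the normalization and sign returned by the eigenvalue routine do not matter here.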

Finally we note that the control law (5) looks different from the one in ref. [1]. The reason for this is that we use the linearization around $\xi_F$ and $p_0$ (as is used in ref. [5]) and do not estimate the new position of the fixed point $\xi_F(p)$ as one changes the parameter $p$,

$$\xi_F(p) = \xi_F(p_0 + \delta p) \approx \xi_F(p_0) + g\,\delta p, \tag{6}$$

as OGY do. But it is easy to show that these approaches are equivalent and that $g$ and $w$ are related through

$$g = \bigl(1 - D_\xi P(\xi_F, p_0)\bigr)^{-1} w. \tag{7}$$
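
The relation between $g$ and $w$ can be made explicit in a few lines. The following is a sketch of the standard argument, not a quotation from the references: differentiate the fixed-point condition with respect to $p$ at $p_0$, writing $A = D_\xi P(\xi_F, p_0)$ and $w = (\partial P/\partial p)(\xi_F, p_0)$.

```latex
\begin{align*}
\xi_F(p) &= P\bigl(\xi_F(p),\, p\bigr)
  && \text{(fixed-point condition for each } p\text{)} \\
g &= D_\xi P(\xi_F, p_0)\, g + \frac{\partial P}{\partial p}(\xi_F, p_0)
   = A\, g + w
  && \text{(derivative with respect to } p \text{ at } p_0\text{)} \\
g &= (1 - A)^{-1} w
  && \text{(provided } 1 - A \text{ is invertible)}
\end{align*}
```

Since the eigenvalues of $A$ satisfy $|\lambda_s| < 1 < |\lambda_u|$, the value 1 is not an eigenvalue and the inverse exists.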

3.  OGY  control  algorithm  in  the  case  of  delay 
coordinates 

Now we consider the case that the dynamical equations are not known. We assume that the only information about the system is obtained by some measurement process which is mathematically realized by some scalar function $Z$ on the state space $M$, $Z: M \to \mathbb{R}$. If $Y(t) \in M$ is the state of the system at time $t$, we obtain the experimental time series $z(t) = Z(Y(t))$. We assume further that the system has settled down on an attractor which lies on some manifold $M_Y \subset M$. Using time delay coordinates with delay $\tau$ and embedding dimension $d$, a $d$-dimensional delay coordinate vector is formed, $X(t) = (z(t), z(t-\tau), \ldots, z(t-(d-1)\tau)) \in \mathbb{R}^d$. For appropriately chosen $d$ and $\tau$ [3] there exists a smooth, invertible mapping $\Phi$ from $M_Y$ to a submanifold $M_X \subset \mathbb{R}^d$, such that

$$X(t) = \Phi(Y(t)), \qquad Y(t) \in M_Y, \tag{8}$$

holds. For a nice visualization see ref. [7]. Let the dynamics in the original phase space be represented by a flow $\varphi$ such that $Y(t + t') = \varphi^{t'}(Y(t))$ holds. This continuous dynamical system induces a discrete dynamical system in an appropriately chosen experimental surface of section. It is obtained by the common choice that one component of $X(t)$ equals a constant, e.g. by the condition $(X(t_i))_1 = z(t_i) = c = \text{const}$. This procedure gives the successive points $\xi_i := (z(t_i - \tau), \ldots, z(t_i - (d-1)\tau)) \in \mathbb{R}^{d-1}$ in the surface of section.
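
For sampled data, the construction of delay vectors and of the points in the surface of section might look as follows. This is a minimal sketch with hypothetical function names; the delay is given in samples, the crossing time is taken at the first sample at or above the level $c$ (no interpolation), and the sine wave is only a toy stand-in for a measured signal.

```python
import numpy as np

def delay_vectors(z, lag, d):
    """Rows are X_n = (z[n], z[n-lag], ..., z[n-(d-1)*lag]) for n >= (d-1)*lag."""
    z = np.asarray(z, dtype=float)
    start = (d - 1) * lag
    cols = [z[start - k * lag: len(z) - k * lag] for k in range(d)]
    return np.column_stack(cols)

def section_points(z, lag, d, c):
    """Points xi_i = (z(t_i - tau), ..., z(t_i - (d-1) tau)) at upward crossings of z = c.

    The crossing time is taken as the first sample at or above c
    (no interpolation), which is a simplification of this sketch.
    """
    X = delay_vectors(z, lag, d)
    first = X[:, 0]
    hits = np.flatnonzero((first[:-1] < c) & (first[1:] >= c)) + 1
    return X[hits, 1:]

n = np.arange(1000)
z = np.sin(2.0 * np.pi * n / 100.0)        # toy "measurement", period 100 samples
xi = section_points(z, lag=25, d=3, c=0.5)
```

Each row of `xi` is one point in the $(d-1)$-dimensional surface of section; consecutive rows are the pairs $(\xi_i, \xi_{i+1})$ on which the section map acts.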

In what follows we focus our interest on the surface of section map so obtained, $\xi_{i+1} = P(\xi_i)$. For the sake of simplicity let us assume that one wants to stabilize an unstable fixed point of $P$ which has been localized by the well-known technique of recurrent points [8-10]. Applying the OGY control algorithm implies that one (instantaneously) changes at the times $t_i$ the parameter $p$ from $p_{i-1}$ to an appropriately chosen parameter $p_i$ using (5). Let us now assume that the time between successive piercings of the surface of section is larger than the lag window, i.e. $t_{i+1} - t_i > (d-1)\tau$.

The reason that one hopes to be able to control the original system $Y(t)$ by observing $X(t)$ is that the embedding $\Phi$ introduced above gives a bijective relation between the states $X(t)$ and $Y(t)$. The mapping $\Phi$ is, however, closely related to the dynamical equations of the system and thus in general dependent on the actual value of the control parameter $p_i$. We will take this fact into account by writing $\Phi_p$ instead of $\Phi$. Our argumentation is now as follows. The point $\xi_i$ at time $t_i$ in the surface of section is related to the original state by

$$Y(t_i) = \Phi_{p_{i-1}}^{-1}\bigl(c, z(t_i - \tau), \ldots, z(t_i - (d-1)\tau)\bigr) = \Phi_{p_{i-1}}^{-1}(X(t_i)). \tag{9}$$

Here we make use of our assumption that $(d-1)\tau < (t_i - t_{i-1})$, which assures that $p_{i-1}$ is the actual value of $p$ during the whole time interval $[t_i - (d-1)\tau,\, t_i]$. The flow in the original state space which describes the development of the system from time $t_i$ to the time $t_{i+1}$ of the $(i+1)$th piercing is in case of activated control given by $\varphi_{p_i}^{t_{i+1}-t_i}$, and thus the state of the system at time $t_{i+1}$ is obtained by

$$Y(t_{i+1}) = \varphi_{p_i}^{t_{i+1}-t_i}(Y(t_i)) \tag{10}$$

and the corresponding state in the embedding space by

$$X(t_{i+1}) = \Phi_{p_i}(Y(t_{i+1})). \tag{11}$$
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Thus  we  finally  obtain  using  eqs.  (9)-(ll) 

$$X(t_{i+1}) = \Phi_{p_i} \circ \varphi_{p_i}^{t_{i+1}-t_i} \circ \Phi_{p_{i-1}}^{-1}(X(t_i)). \tag{12}$$

This yields our main conclusion. In the case of activated control (i.e. switching the parameter from $p_{i-1}$ to $p_i$ at time $t_i$) the experimental surface of section map $P$ depends not only on the new actual value $p_i$ but also on the preceding value $p_{i-1}$, i.e.

$$\xi_{i+1} = P(\xi_i, p_{i-1}, p_i). \tag{13}$$

Note that if the assumption $(t_i - t_{i-1}) > (d-1)\tau$ (for all $i$) is broken and e.g. only $2(t_i - t_{i-1}) > (d-1)\tau$ holds, then the result is straightforwardly generalized to $\xi_{i+1} = P(\xi_i, p_{i-2}, p_{i-1}, p_i)$, i.e. dependencies on further preceding values of $p$ have to be taken into account. For ease of notation we restrict ourselves here to the case $(t_i - t_{i-1}) > (d-1)\tau$ for all $i$.

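Taking (13) as starting point the algorithm of OGY is straightforwardly extended. The linearization which one has to consider now is given by

$$\delta\xi_{i+1} = A\,\delta\xi_i + v\,\delta p_{i-1} + u\,\delta p_i \tag{14}$$

with $A = D_\xi P(\xi_F, p_0, p_0)$, $v = (\partial P/\partial p_{i-1})(\xi_F, p_0, p_0)$, and $u = (\partial P/\partial p_i)(\xi_F, p_0, p_0)$. Demanding $f_u \cdot \delta\xi_{i+1} = 0$ renders the new control formula

$$\delta p_i = -\lambda_u\,\frac{f_u \cdot \delta\xi_i}{f_u \cdot u} - \frac{f_u \cdot v}{f_u \cdot u}\,\delta p_{i-1}. \tag{15}$$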


When $P$ is not influenced by the preceding parameter perturbation $\delta p_{i-1}$, which is equivalent to $v = 0$, the original OGY control formula (5) should be recovered. To see this we note that the vector $w$ in the control formula (5) is related to $u$ and $v$ by $w = u + v$. This is so because calculating $w$ in practice (see also section 4) means that one changes $p_0$ to $p_0 + \bar{p}$, $\bar{p}$ small, and observes for a while how the behavior of $P$ changes in the neighborhood of $\xi_F$, i.e. one looks at $P(\xi_F + \delta\xi, p_0 + \bar{p}, p_0 + \bar{p})$ and therefore one determines $w = u + v$.

The control formula (15) has to be applied in principle at every step, even in the absence of measurement errors or noise. In case of activated control, $\delta p_i \neq 0$, the control requirement $f_u \cdot \delta\xi_{i+1} = 0$ does not guarantee that $\xi_{i+2}$ will also fall on the stable manifold. The reason for this is that $\xi_F$ is the fixed point of $P(\cdot, p_0, p_0)$ and not of $P(\cdot, p_0 + \delta p_i, p_{i+1})$, $p_{i+1} = p_0$, which would determine $\xi_{i+2}$ if one does not apply a further perturbation $\delta p_{i+1}$. Therefore, $\xi_{i+2}$ will only stay on the local stable manifold of $\xi_F$ (i.e. $f_u \cdot \delta\xi_{i+2} = 0$) if a perturbation $\delta p_{i+1}$ is chosen according to the control formula (15). This is easily checked using (14).
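
The check can be written out as follows; this is a sketch that uses only the linearization (14) at step $i+1$, the left-eigenvector property $f_u A = \lambda_u f_u$ (which follows from the decomposition of $A$ in section 2), and the assumption that the previous step already achieved $f_u \cdot \delta\xi_{i+1} = 0$.

```latex
\begin{align*}
f_u \cdot \delta\xi_{i+2}
  &= f_u \cdot \bigl(A\,\delta\xi_{i+1} + v\,\delta p_i + u\,\delta p_{i+1}\bigr)
  && \text{(14) at step } i+1 \\
  &= \lambda_u\, f_u \cdot \delta\xi_{i+1} + (f_u \cdot v)\,\delta p_i + (f_u \cdot u)\,\delta p_{i+1}
  && (f_u A = \lambda_u f_u) \\
  &= (f_u \cdot v)\,\delta p_i + (f_u \cdot u)\,\delta p_{i+1}
  && (f_u \cdot \delta\xi_{i+1} = 0)
\end{align*}
```

The last line vanishes exactly when $\delta p_{i+1} = -\dfrac{f_u \cdot v}{f_u \cdot u}\,\delta p_i$, which is the control formula (15) at step $i+1$ with $f_u \cdot \delta\xi_{i+1} = 0$. With $\delta p_{i+1} = 0$ instead, a nonzero remainder $(f_u \cdot v)\,\delta p_i$ is left.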

In section 4 we show examples where the control law (15) is already able to control systems for which the OGY algorithm fails. It is, however, not sufficient for all cases. Consider the case that $|(f_u \cdot v)/(f_u \cdot u)| \ge 1$. Regarding the deviations of $\delta\xi_i$ from zero as stochastic, (15) is then equivalent to a non-stationary AR(1) (autoregressive of order 1) process (see ref. [11]), i.e. the expectation value of $(\delta p_i)^2$ will diverge for growing $i$ until for some $i$, $\delta p_i$ will exceed the maximum allowed value $\delta p_{\max}$, and the range of control will be left.
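
The growth mechanism can be illustrated with the noise-free part of the recursion alone. The sketch below is a simplification: it drops the stochastic driving term and iterates only $\delta p_i = b_2\,\delta p_{i-1}$ with $b_2 = -(f_u \cdot v)/(f_u \cdot u)$; the function name and the starting value are assumptions, while the two values of $b_2$ and the bound 0.5 are the ones quoted in section 4.

```python
def steps_until_out_of_range(b2, dp_start, dp_max, max_steps=1000):
    """Iterate dp_i = b2 * dp_{i-1} and return the first step with |dp_i| > dp_max.

    Returns None if the bound is not exceeded within max_steps.
    """
    dp = dp_start
    for step in range(1, max_steps + 1):
        dp = b2 * dp
        if abs(dp) > dp_max:
            return step
    return None

unstable = steps_until_out_of_range(b2=-13.0, dp_start=1e-6, dp_max=0.5)
stable = steps_until_out_of_range(b2=0.2, dp_start=1e-6, dp_max=0.5)
```

With $|b_2| > 1$ even a tiny residual perturbation leaves the allowed range after a handful of steps, whereas for $|b_2| < 1$ it decays and the bound is never reached.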

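To avoid this instability (i.e. the growth of $\delta p_i$) we propose an alternative approach. We try to find a control law for $\delta p_i$ such that $\delta p_{i+1}$ will automatically become zero. We do this by demanding that the system stabilizes only at the next step but one, $i+2$, and that $\delta p_{i+1}$ equals zero, i.e. by the requirements

$$f_u \cdot \delta\xi_{i+2} = 0 \quad \text{and} \quad \delta p_{i+1} = 0. \tag{16}$$

Using the linearization (14) twice, the requirement (16) yields the new control formula

$$\delta p_i = -\frac{\lambda_u^2}{\lambda_u\, f_u \cdot u + f_u \cdot v}\, f_u \cdot \delta\xi_i - \frac{\lambda_u\, f_u \cdot v}{\lambda_u\, f_u \cdot u + f_u \cdot v}\,\delta p_{i-1} \tag{17}$$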
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(where  u,  v  have  again  to  fulfill  the  generic 
condition  that  the  occurring  denominators  are 
different  from  zero). 
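
The three control formulas can be compared numerically on the linearized model (14). The sketch below uses hypothetical numbers throughout (a diagonal $A$, so that $f_u = (1, 0)$, and arbitrary small vectors $u$, $v$); the coefficients multiply the unnormalized product $f_u \cdot \delta\xi_i$, and the function names are assumptions of this example.

```python
import numpy as np

def control_coefficients(lam_u, f_u, u, v):
    """Coefficients multiplying f_u . dxi_i and dp_{i-1} in the three control laws."""
    alpha = float(f_u @ u)          # f_u . u
    beta = float(f_u @ v)           # f_u . v
    denom = lam_u * alpha + beta
    return {
        "ogy": (-lam_u / (alpha + beta), 0.0),                  # (5) with w = u + v
        "mod1": (-lam_u / alpha, -beta / alpha),                # (15)
        "mod2": (-lam_u ** 2 / denom, -lam_u * beta / denom),   # (17)
    }

def linear_step(A, u, v, dxi, dp_prev, dp):
    """Linearized section map (14)."""
    return A @ dxi + v * dp_prev + u * dp

# Hypothetical numbers, chosen only to exercise the formulas.
lam_u, lam_s = -1.85, 0.3
A = np.diag([lam_u, lam_s])         # diagonal, so f_u = (1, 0)
f_u = np.array([1.0, 0.0])
u = np.array([-0.925, 0.1])
v = np.array([0.185, -0.05])

coef = control_coefficients(lam_u, f_u, u, v)
dxi0, dp_prev = np.array([0.01, -0.02]), 0.002

# Second modification (17): two steps, with dp_{i+1} = 0.
c1, c2 = coef["mod2"]
dp0 = c1 * (f_u @ dxi0) + c2 * dp_prev
dxi1 = linear_step(A, u, v, dxi0, dp_prev, dp0)
dxi2 = linear_step(A, u, v, dxi1, dp0, 0.0)

# First modification (15): one step.
b1, b2 = coef["mod1"]
dp0_mod1 = b1 * (f_u @ dxi0) + b2 * dp_prev
dxi1_mod1 = linear_step(A, u, v, dxi0, dp_prev, dp0_mod1)
```

In this linear setting, (15) removes the unstable component after one step, while (17) removes it after two steps without requiring a further perturbation in between.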

The new control requirements lead to a self-consistency of the control procedure. Assuming a perfect linearization and the absence of noise or measurement errors, this time the perturbations could in principle be turned off because the iterate $\xi_{i+3}$ would automatically fall on the stable manifold. In practice, of course, the error in the linearization, which was here even used twice, will always require the control to be activated at every time step $i$.


4.  Application  of  the  control  algorithms  to  the 
Duffing  oscillator 

The algorithms discussed above have been applied to simulations of a Duffing oscillator [12]. This oscillator is given by

$$\ddot{x} + d\,\dot{x} + x + x^3 = f \cos \omega t, \tag{18}$$

or, equivalently written as

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = -d\,x_2 - x_1 - x_1^3 + f \cos \omega x_3, \qquad \dot{x}_3 = 1. \tag{19}$$

This system has been numerically integrated. As a measurement function the displacement of the oscillator, $z(t) = x_1(t)$, is chosen. We use a three-dimensional embedding with a delay time $\tau$ equal to a fixed fraction of $T$, $T = 2\pi/\omega$ being the period of the driving. The experimental surface of section was obtained by taking $(X(t_i))_1 = z(t_i) = \text{const}$. To ensure that the trajectory always pierces the surface of section from one side we added the condition $\dot{z}(t_i) > 0$. In this way a sample of successive points $\xi_i = (z(t_i - \tau), z(t_i - 2\tau))$, $i = 1, \ldots, N$, in the two-dimensional surface of section was recorded.
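
A simulated "experimental" time series of this kind can be generated with a few lines of code. The sketch below integrates the autonomous form of the Duffing oscillator with a classical Runge-Kutta scheme; the parameter values are the ones quoted for fig. 1, while the integrator, the step size, the initial condition and all function names are assumptions of this example, not the authors' choices.

```python
import numpy as np

D, F, OMEGA = 0.2, 36.0, 0.661     # parameter values quoted in the text

def duffing_rhs(x, f=F):
    """Right-hand side of the autonomous form (19); x = (x1, x2, x3)."""
    x1, x2, x3 = x
    return np.array([x2, -D * x2 - x1 - x1 ** 3 + f * np.cos(OMEGA * x3), 1.0])

def rk4_step(x, h, f=F):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = duffing_rhs(x, f)
    k2 = duffing_rhs(x + 0.5 * h * k1, f)
    k3 = duffing_rhs(x + 0.5 * h * k2, f)
    k4 = duffing_rhs(x + h * k3, f)
    return x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def displacement_series(x0, h, n_steps):
    """Integrate and return the 'measured' time series z(t) = x1(t)."""
    x = np.asarray(x0, dtype=float)
    z = np.empty(n_steps)
    for i in range(n_steps):
        x = rk4_step(x, h)
        z[i] = x[0]
    return z

T = 2.0 * np.pi / OMEGA                       # period of the driving
z = displacement_series([1.0, 0.0, 0.0], h=T / 400.0, n_steps=2000)
```

The driving amplitude is passed as the argument `f` so that a control loop could change it from one piercing of the surface of section to the next.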

For the localization of fixed points, a method described in refs. [8,9] was used. From the sample $\xi_i$, $i = 1, \ldots, N$, the $n$ best recurrent points $\xi_{i_1}, \ldots, \xi_{i_n}$ are determined by

$$\min_i \| P(\xi_i) - \xi_i \|. \tag{20}$$

The correct grouping of the recurrent points into classes belonging to the same fixed point can be accomplished by the following procedure, repeated for different values of a “maximum-distance parameter” $\epsilon$: $\xi_{i_1}$ is taken as a master point for the first class. $\xi_{i_2}$ is attached to the same class if $\|\xi_{i_1} - \xi_{i_2}\| < \epsilon$, otherwise it is taken as a master point of a new class, and so on. In general the number of classes obtained by this procedure approaches 1 for large values of $\epsilon$ and $n$ for small values; if, however, it remains constant over a broad intermediate range of $\epsilon$ values, one can suppose that the correct classification has been found.
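
The selection and grouping of recurrent points might be coded as follows. This is a sketch under assumptions: the sample is an array of successive section points, so that $P(\xi_i)$ is simply the next row; a candidate is compared with the master points of all existing classes in turn, which is one reading of "and so on"; the function names are hypothetical.

```python
import numpy as np

def best_recurrent(points, n):
    """Indices i of the n smallest ||xi_{i+1} - xi_i|| in a sequence of section points."""
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts[1:] - pts[:-1], axis=1)
    return np.argsort(dist)[:n]

def group_by_distance(candidates, eps):
    """Master-point grouping with maximum-distance parameter eps.

    Each candidate joins the first class whose master point is closer
    than eps; otherwise it becomes the master point of a new class.
    """
    masters, classes = [], []
    for idx, x in enumerate(candidates):
        for k, m in enumerate(masters):
            if np.linalg.norm(x - m) < eps:
                classes[k].append(idx)
                break
        else:
            masters.append(x)
            classes.append([idx])
    return classes

cands = np.array([[0.0, 0.0], [0.01, 0.0], [1.0, 1.0], [1.02, 1.0], [0.0, 0.02]])
n_classes = {eps: len(group_by_distance(cands, eps)) for eps in (1e-6, 0.1, 10.0)}
```

Scanning `eps` over several orders of magnitude and looking for a plateau in the number of classes corresponds to the criterion described in the text.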

Having determined a class of recurrent points, the exact value of the corresponding fixed point $\xi_F$ and the linearization $A$ around $\xi_F$ must be found. We do this by fitting an affine mapping

$$P(\xi_{i_k}) = A\,\xi_{i_k} + b, \qquad k = 1, \ldots, n_c, \tag{21}$$

to the pairs $\xi_{i_k}$, $P(\xi_{i_k})$ belonging to the class considered. For this purpose the general least square method described in [13, p. 509 ff., 518 f.] is used. For $k = 1, \ldots, d-1$ it is applied to

$$z_k(i) \approx \sum_{j=1}^{d-1} a_{kj}\, g_j(i) + b_k\, g_d(i), \qquad i = 1, \ldots, n_c, \tag{22}$$

with $z_k(i) = (P(\xi_i))_k$ and the “nonlinear basis functions” $g$ given by $g_j(i) = (\xi_i)_j$ and $g_d(i) = 1$. This fitting method was employed iteratively: as weights for the sample points, we used a function of their distances to the fixed point $\xi_F$. In the first step, as an initial guess for $\xi_F$, the master point of the class was taken. Having obtained $A$ and $b$, a new guess of the fixed point $\xi_F$ was calculated by solving $\xi_F = (1 - A)^{-1} b$. The iteration was continued until $\xi_F$ and $A$ stayed approximately constant from one step to the other. The eigenvalue $\lambda_u$ and the corresponding left (contravariant) eigenvector $f_u$ can then be easily found.
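
The iterative fit can be sketched with a weighted least-squares solve. The following is an illustration only: the Gaussian weight function and its width `scale` are assumptions (the text specifies only "a function of their distances"), a fixed number of iterations replaces the convergence test, and the data are synthetic pairs generated from a known affine map.

```python
import numpy as np

def fit_affine(x, y, weights):
    """Weighted least-squares fit of y_k ~ A x_k + b.

    The basis functions are the components of x plus the constant 1.
    """
    G = np.column_stack([x, np.ones(len(x))])
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(G * sw, y * sw, rcond=None)
    return coef[:-1].T, coef[-1]          # A, b

def refine_fixed_point(x, y, xi_start, scale, n_iter=10):
    """Alternate between weighting by distance to xi_F and refitting (A, b)."""
    xi_f = np.asarray(xi_start, dtype=float)
    for _ in range(n_iter):
        dist2 = np.sum((x - xi_f) ** 2, axis=1)
        weights = np.exp(-dist2 / scale ** 2)            # assumed weight function
        A, b = fit_affine(x, y, weights)
        xi_f = np.linalg.solve(np.eye(len(b)) - A, b)    # xi_F = (1 - A)^{-1} b
    return xi_f, A

# Synthetic class of recurrent points around a known fixed point.
A_true = np.array([[-1.8, 0.3], [1.0, 0.0]])
b_true = np.array([2.0, 0.0])
rng = np.random.default_rng(1)
x = np.array([0.8, 0.8]) + 0.05 * rng.standard_normal((40, 2))
y = x @ A_true.T + b_true
xi_f, A_fit = refine_fixed_point(x, y, xi_start=x[0], scale=0.1)
lam, left = np.linalg.eig(A_fit.T)      # columns of `left`: left eigenvectors of A
```

The left eigenvector belonging to the eigenvalue of largest modulus in `lam` is the vector $f_u$ needed in the control formulas.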

What one finally needs to perform control are the vectors $u$ and $v$. Shifting $p$ to a small constant value $p_0 + \bar{p}$ and observing the behavior of $P$ for a while, as is done in the OGY algorithm, only renders the sum $w = u + v$. What one has to do instead is to switch the perturbation alternately on and off at every piercing of the surface of section such that $\delta p_i = 0$ for $i$ odd and $\delta p_i = \bar{p}$ for $i$ even, $\bar{p}$ small. Regarding all pairs $(\xi_i, \xi_{i+1})$ with even $i$ as one group, and the ones with odd $i$ as another, it is now possible to fit affine mappings in the neighborhood of $\xi_F$ to $P(\cdot, p_0 + \bar{p}, p_0)$ using only pairs $(\xi_i, \xi_{i+1})$ with $i$ odd, and to $P(\cdot, p_0, p_0 + \bar{p})$ using only pairs $(\xi_i, \xi_{i+1})$ with $i$ even. Using these fits, $u$ and $v$ are then determined by the relations

$$P(\xi_F, p_0 + \bar{p}, p_0) \approx \xi_F + v\,\bar{p}, \qquad P(\xi_F, p_0, p_0 + \bar{p}) \approx \xi_F + u\,\bar{p}. \tag{23}$$

One can now try to achieve control.
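
The parity split can be illustrated on synthetic data. In the sketch below the pairs are generated directly from the linearization (14) with independently sampled points near the fixed point, not from a real trajectory; all numerical values and function names are hypothetical, and an unweighted least-squares fit is used for brevity.

```python
import numpy as np

def affine_value_at(x, y, point):
    """Fit y ~ A x + b by least squares and evaluate the fit at `point`."""
    G = np.column_stack([x, np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(G, y, rcond=None)
    return np.append(point, 1.0) @ coef

def estimate_u_v(xi, xi_next, index, xi_f, p_bar):
    """Split pairs (xi_i, xi_{i+1}) by parity of i and apply relations (23)."""
    odd = index % 2 == 1      # dp_{i-1} = p_bar, dp_i = 0      -> gives v
    even = ~odd               # dp_{i-1} = 0,     dp_i = p_bar  -> gives u
    v = (affine_value_at(xi[odd], xi_next[odd], xi_f) - xi_f) / p_bar
    u = (affine_value_at(xi[even], xi_next[even], xi_f) - xi_f) / p_bar
    return u, v

# Synthetic pairs generated from the linearization (14); all values hypothetical.
rng = np.random.default_rng(2)
xi_f = np.array([0.8, 0.8])
A = np.array([[-1.8, 0.3], [1.0, 0.0]])
u_true, v_true = np.array([0.9, 0.1]), np.array([-0.2, 0.4])
p_bar = 0.01
index = np.arange(200)
dp = np.where(index % 2 == 0, p_bar, 0.0)          # on for even i, off for odd i
dp_prev = np.where(index % 2 == 1, p_bar, 0.0)     # value during the previous interval
xi = xi_f + 0.05 * rng.standard_normal((200, 2))
xi_next = xi_f + (xi - xi_f) @ A.T + np.outer(dp_prev, v_true) + np.outer(dp, u_true)
u_est, v_est = estimate_u_v(xi, xi_next, index, xi_f, p_bar)
```

A constant shift of the parameter would make `dp` and `dp_prev` equal for every pair, so only the sum $u + v$ could be recovered, which is why the alternating protocol is needed.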

In fig. 1 a chaotic attractor of the Duffing oscillator (with $d = 0.2$, $f = 36$, $\omega = 0.661$) in the embedding space is shown. For this attractor the conditions for the surface of section are chosen to be $z(t_i) = 1$, $\dot{z}(t_i) > 0$, and the additional requirement $z(t_i - \tau) < 0$ is used. The surface of section is indicated in fig. 1. In fig. 2 the attractor in the surface of section is shown. Three fixed points (marked as $\xi_{F1}$, $\xi_{F2}$, and $\xi_{F3}$) embedded in the attractor could be determined. The periodic orbits corresponding to these fixed points are presented in fig. 3.

To compare the performances of the three different versions of the control formula we tried to stabilize these three fixed points $\xi_{F1}$, $\xi_{F2}$, and $\xi_{F3}$. As accessible parameter $p$ the amplitude of the driving $f$ was chosen, i.e. $p = f$ and $p_0 = f_0 = 36$. The maximal allowed perturbation was

Fig. 1. For the Duffing oscillator ($d = 0.2$, $f = 36$, $\omega = 0.661$) a two-dimensional projection of the chaotic attractor in the three-dimensional embedding space is shown. The delay time $\tau$ is a fixed fraction of the driving period $T = 2\pi/\omega$. The line marks the half plane which we used as surface of section.

Fig. 2. The chaotic attractor of fig. 1 in the surface of section. The surface of section was obtained by the conditions $z(t_i) = 1$, $\dot{z}(t_i) > 0$ and $z(t_i - \tau) < 0$. Three unstable fixed points observed are indicated by the crosses. For further reference we call them $\xi_{F1}$, $\xi_{F2}$ and $\xi_{F3}$.

Fig. 3. The periodic orbits corresponding to the fixed points in the surface of section of fig. 2 are plotted in the two-dimensional projection of the embedding space (compare fig. 1): (a) corresponds to $\xi_{F1}$, (b) to $\xi_{F2}$ and (c) to $\xi_{F3}$.

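Table 1

The three control formulas used for control, with $u = (\partial P/\partial p_i)(\xi_F, p_0, p_0)$, $v = (\partial P/\partial p_{i-1})(\xi_F, p_0, p_0)$, and $\hat{f}_u = f_u/\|f_u\|$. The constant control coefficients ($a$, $b_1$, ...) are introduced implicitly.

OGY (5):   $\delta p_i = -\dfrac{\lambda_u}{f_u \cdot (u + v)}\, f_u \cdot \delta\xi_i = a\, \hat{f}_u \cdot \delta\xi_i$

MOD1 (15): $\delta p_i = -\dfrac{\lambda_u}{f_u \cdot u}\, f_u \cdot \delta\xi_i - \dfrac{f_u \cdot v}{f_u \cdot u}\, \delta p_{i-1} = b_1\, \hat{f}_u \cdot \delta\xi_i + b_2\, \delta p_{i-1}$

MOD2 (17): $\delta p_i = -\dfrac{\lambda_u^2}{\lambda_u f_u \cdot u + f_u \cdot v}\, f_u \cdot \delta\xi_i - \dfrac{\lambda_u f_u \cdot v}{\lambda_u f_u \cdot u + f_u \cdot v}\, \delta p_{i-1} = c_1\, \hat{f}_u \cdot \delta\xi_i + c_2\, \delta p_{i-1}$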


fixed at $\delta p_{\max} = 0.5$. In table 1 the different control formulas are summarized. The numerically obtained values of the coefficients in the control formulas for the three fixed points are given in table 2. These coefficients are only accurate to about 10%. The reason for this is that $u$ and $v$ are usually very small, so that to determine the coefficients one has to divide by small numbers. Therefore, even slight changes in $u$ and $v$, which can be caused by changing the number of neighboring points used for the fitting of $P$, can yield a noticeable change in the coefficients. Fortunately, these variations


Table 2

The numerically obtained values of the coefficients in the control formulas (see table 1) for the three fixed points considered.

Fixed point    a       b_1      b_2     c_1     c_2     λ_u
ξ_F1           ?       -2.28    -13     39      ?       -1.87
ξ_F2           165     -7       1.04    -9      1.?     4.82
ξ_F3           -2.5    -2       0.20    -1.8    0.1?    -1.85


in the coefficients did not affect the ability of the control algorithms to achieve control.

We start with the stabilization of the fixed point $\xi_{F1}$. In fig. 4 the three different control formulas are successively applied. As can be

Fig. 4. (a) First component $(\xi_i)_1$ of the points in the surface of section versus $i$. In order to stabilize the fixed point $\xi_{F1}$ the OGY control law (5) was switched on from $i$ = 0-200, the control law (15) from $i$ = 201-400, the control law (17) from $i$ = 401-600, and again OGY's control law from 601-800. As can be seen only the procedure (17) was able to stabilize $\xi_{F1}$. (b) The parameter perturbations $\delta p_i$ versus $i$ used for control are shown. The maximal allowed disturbance was $\delta p_{\max} = 0.5$.




G. Nitsche, U. Dressler / Controlling chaotic dynamical systems


seen, only the second modification (17) was able to stabilize $\xi_{F1}$. The coefficients of the control formulas (see table 2) explain why the first modification (15) of the OGY algorithm did not work. The criterion for a stable control algorithm, $|b_2| < 1$, was violated. The large absolute value of $b_2 = -(f_u \cdot v)/(f_u \cdot u)$, here $b_2 = -13$, indicates further that the influence of the change of the preceding parameter $p_{i-1}$ is relatively larger than that of the actual one $p_i$. But this is exactly what is neglected if one applies the original approach of OGY without considering the meaning of the time-delay coordinates.

The stabilization of the second fixed point shows different features. Here the generic condition ($f_u \cdot w \neq 0$) of the OGY formula is almost violated. Because of the resulting large value of the coefficient $a$ ($a = 165$) there were only rare cases where the control requirement $|\delta p_i| < \delta p_{\max}$ was met. But even then the control range was soon left without succeeding in control. In fig. 5 the OGY algorithm is first applied, then the first modification (15) is used to control $\xi_{F2}$. The coefficient $b_2$ ($b_2 = 1.04$) just violates the stability criterion. Indeed, the perturbations $\delta p_i$ used increase at the beginning. But finally, probably due to nonlinear effects, the control procedure stabilizes and the algorithm is capable of achieving control. After 200 iterations, at $i = 401$, the second modification (17) is activated. The fixed point $\xi_{F2}$ stays controlled, but the perturbations $\delta p_i$ needed decrease drastically. This could be expected considering how this control formula (17) was derived using the requirement $\delta p_{i+1} = 0$.

The third fixed point could be stabilized by any of the three versions of the control formula. For $\xi_{F3}$ the coefficients of the control formulas are very similar (see table 2). The coefficient $b_2$ is relatively small ($b_2 = 0.2$), which indicates the small influence of $\delta p_{i-1}$ compared to $\delta p_i$. So one can expect that all three algorithms will work. In addition, all coefficients are relatively small compared to the ones of $\xi_{F1}$ and $\xi_{F2}$, which also gives a hint that it is not hard to stabilize $\xi_{F3}$.

In fig. 6 we switched from controlling $\xi_{F1}$ to controlling $\xi_{F2}$ and then $\xi_{F3}$, using the second modification (17), which was able to achieve control for all three fixed points.




Fig. 5. (a) First component $(\xi_i)_1$ of the points in the surface of section versus $i$. In order to stabilize $\xi_{F2}$ the control procedures were successively initiated: from 0-200 OGY's law, from 201-400 (15), and from 401-600 (17). (15) and (17) succeeded in stabilizing $\xi_{F2}$. Because of the large value of the coefficient $a$ ($a = 165$) in the OGY formula the parameter perturbations did not happen to fulfill $|\delta p_i| < \delta p_{\max} = 0.5$.

Fig. 6. Using the control law (17) successive control of the fixed points $\xi_{F1}$, $\xi_{F2}$ and $\xi_{F3}$ could be achieved. From $i$ = 0-200 the control formula (17) was applied to stabilize $\xi_{F1}$, from 201-400 to stabilize $\xi_{F2}$ and from 401-600 to stabilize $\xi_{F3}$.


Ditto et al. [4] successfully applied the original OGY control formula to a real experiment. Their experimental system was periodically driven with period $T$. As time series they took a stroboscopic measurement $x(t_i)$, $t_i - t_{i-1} = T$. In this way they obtained a surface of section with points $\xi_i = (x(t_i), x(t_{i-1}))$. We also tested the three versions of the control method using this way of obtaining a surface of section. The periodic motion corresponding to $\xi_{F1}$ could be stabilized by all three algorithms. They were almost equivalent because $b_2$ and $c_2$ were nearly zero (of the order of $10^{-4}$), so the other coefficients were practically the same ($a \approx b_1 \approx c_1 \approx 2.7$). The periodic motion corresponding to $\xi_{F2}$ could not be stabilized because the embedding in the neighborhood of the fixed point was bad (not injective). The third fixed point, finally, could only be stabilized using the second modification (17).

Altogether the numerical investigations show that the possibility to stabilize a fixed point is not an intrinsic property of the fixed point, as for example the eigenvalues $\lambda_u$ and $\lambda_s$ are. The algorithms were also tested using further surfaces of section. In every case it was confirmed that the quality of the embedding in the neighborhood of the fixed point is of crucial importance for the success of the control procedure. Furthermore, the coefficients of the control formulas differ for different surfaces of section and so do their performances. We always observed that the first modification (15) only worked successfully when the second modification (17) could achieve control, too. The original OGY control formula was able to achieve control only when the two modifications also had success. We never saw the second modification (17) fail while any of the other methods was successful. But we did observe that the OGY formula failed while the application of one of the modifications could stabilize the desired fixed points. As a rule this happened when the influence of the preceding parameter was noticeable, which resulted in a non-negligible value of $f_u \cdot v$.

5.  Measurement  noise 

With regard to possible applications of the control method to experimental situations the robustness of the method in the presence of noise is of special interest. Noise can cause several problems. As was already mentioned in ref. [1], noise can kick an orbit out of the control region such that the control requirement $|\delta p_i| < \delta p_{\max}$ gets broken. If the attractor has to be reconstructed from a measurement signal, further difficulties arise. In what follows we restrict ourselves to the issue of measurement noise.

A major problem already occurs in the preprocessing of the data. The numerically obtained values for the fixed point, its linearization $A$ and the vectors $u$ and $v$ become more and more imperfect as the noise level increases. This directly affects the resulting control coefficients of the control formulas.

During the control process the exact position of the system in the experimental surface of section is no longer known. Therefore, the control process cannot act exactly in the desired way, which limits its control abilities as well.

To study the effects of measurement noise a noise term $\epsilon\delta$ was added to the measurement function $z(t)$. The random variable $\delta$ was chosen to be uniformly distributed in the interval $[-1, 1]$. The parameter $\epsilon$ specifies the intensity of the noise. The surface of section of fig. 1 was used.
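The noise model described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `add_measurement_noise`, the sample signal and the value of `eps` are hypothetical and are not taken from the paper.

```python
import random

def add_measurement_noise(signal, eps, seed=0):
    """Return signal + eps*delta, with delta drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [z + eps * rng.uniform(-1.0, 1.0) for z in signal]

# Hypothetical clean measurement samples z(t_k) and an assumed noise intensity.
clean = [0.1 * k for k in range(5)]
noisy = add_measurement_noise(clean, eps=1e-3)
```

Each noisy sample differs from the clean one by at most `eps`, so `eps` plays the role of the noise intensity.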

For a noise level $\epsilon$ = 3 x 10"^ the second modification was able to stabilize all three fixed points. In fig. 7 the fixed points $\xi_{F1}$, $\xi_{F2}$, $\xi_{F3}$ and again $\xi_{F1}$ are successively stabilized. As can be seen, the required control signal $\delta p_i$ is this time much bigger than for the system without noise (see fig. 6). For $\epsilon$ = 3 x 10“ ’ the second modification gives the only control formula which stabilizes any of the fixed points. While for noiseless data all control formulas could stabilize $\xi_{F3}$, the OGY formula and the first modification lose this ability (see fig. 8). Therefore, the second modification seems to be more robust to measurement noise.

But already the noise level $\epsilon$ = 5 x 10'^ leads to a collapse of the control performance of all three versions of the control algorithm. Here the main problem comes from the imperfect numerical determination of the control parameters (control coefficients and position of the fixed points). They were straightforwardly determined without using any noise reduction techniques (e.g. the recently developed techniques [14, 15]). The tolerable noise level can be increased if the control parameters are taken from the noise-free data (this is of course not possible in an experimental situation). In fig. 9 the system is spoiled with a noise level $\epsilon$ = 10“. This time the second modification is able to stabilize $\xi_{F3}$ if the noise-free control parameters are used. The two other versions in contrast fail to control $\xi_{F3}$. But for the other two fixed points ($\xi_{F1}$, $\xi_{F2}$) the second modification also loses its control ability.

Fig. 7. For a noise level of $\epsilon$ = 3 x 10"' the fixed points $\xi_{F1}$, $\xi_{F2}$, $\xi_{F3}$ and again $\xi_{F1}$ are successively stabilized using the second modification (17). The control parameters were determined from the noisy signal.

Fig. 8. For a noise level of $\epsilon$ = 3 x 10 ’ ^ we tried to stabilize the third fixed point using successively the OGY control formula ($i$ = 1-400), the first modification ($i$ = 401-800), the second modification ($i$ = 801-1200), and again the OGY formula ($i$ = 1201-1600). Only the second modification was able to stabilize $\xi_{F3}$.

Fig. 9. With the control parameters (table 2) of the noiseless time series the control method was applied to a noisy signal with $\epsilon$ = 10"’ to stabilize the third fixed point $\xi_{F3}$. From $i$ = 1-400 OGY's formula was used, from $i$ = 401-800 the first modification, from $i$ = 801-1200 the second modification and finally from $i$ = 1201-1600 again the OGY formula. Only the second modification could stabilize $\xi_{F3}$.

So far the numerical simulations show that noise can cause severe problems for the success of the control method. To quantify these problems and to incorporate, e.g., noise reduction methods as a remedy, a lot of work remains to be done.


6.  Summary  and  conclusions 

We investigated the control method of Ott, Grebogi and Yorke in the case that one uses time-delay coordinates to reconstruct the attractor from a time series. It turned out that during the control process (switching on and off the parameter perturbations $\delta p_i$) the experimental surface of section mapping $P$, whose fixed points one wants to stabilize, depends not only on the new value $p_i$ at time $t_i$ (the time of the $i$th piercing of the surface of section by the trajectory) but also on the preceding parameter $p_{i-1}$, which was valid in the time interval $(t_{i-1}, t_i)$, i.e.

$\xi_{i+1} = P(\xi_i, p_{i-1}, p_i)$

holds. Using this equation as starting point two modifications of the original OGY algorithm were proposed. The first modification (15) is a straightforward extension of the OGY algorithm. It only takes the dependence on $p_{i-1}$ into account, which results in the appearance of $v = \partial P/\partial p_{i-1}$ and of $\delta p_{i-1}$ in the control formula (15).

The second modification (17) was introduced as a remedy for a possible instability (possible growth of the applied perturbations $\delta p_i$) of the first modification. This would occur when the coefficient preceding $\delta p_{i-1}$ in (15) exceeds 1, which is equivalent to $|f_u \cdot v| > |f_u \cdot u|$. In this case we propose to stabilize the fixed point by requiring that the system stabilizes only at the next but one step, i.e. $f_u \cdot \delta\xi_{i+2} = 0$, and that the perturbation needed in the next step, $\delta p_{i+1}$, equals zero. These requirements yield the second modification. In table 1 the original OGY formula and the two modifications are listed.
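The instability criterion quoted above, $|f_u \cdot v| > |f_u \cdot u|$, is easy to check numerically once the three vectors have been estimated. The following is a minimal sketch; the function names and the sample vectors are hypothetical and do not come from the paper's tables.

```python
def dot(a, b):
    """Euclidean scalar product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))

def feedback_ratio(f_u, u, v):
    """|f_u . v| / |f_u . u|, the magnitude of the coefficient of dp_{i-1}."""
    return abs(dot(f_u, v)) / abs(dot(f_u, u))

def first_modification_may_grow(f_u, u, v):
    """True when the criterion |f_u . v| > |f_u . u| from the text is met."""
    return feedback_ratio(f_u, u, v) > 1.0

# Hypothetical vectors, chosen only to show both outcomes.
f_u = (1.0, 0.5)
u = (0.2, 0.1)
strong_v = (0.6, 0.4)   # strong influence of the preceding parameter
weak_v = (0.02, 0.0)    # weak influence of the preceding parameter
```

With `strong_v` the ratio is about 3.2, the situation for which the text proposes the second modification; with `weak_v` it is about 0.08.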

These three control formulas are applied to simulations of a damped and driven Duffing oscillator. The Duffing oscillator is numerically integrated and its displacements are taken as experimental time series. Using delay coordinates the attractor is reconstructed. The control capabilities of the different control algorithms are demonstrated by activating them to stabilize the three fixed points which are determined in the surface of section. We find that the performance of the first modification is superior to that of the original OGY formula and that the second modification outperforms the other two. However, their performances are similar whenever the influence of the preceding parameter perturbation $\delta p_{i-1}$ is small, which results in a small value of $f_u \cdot v$.

Finally the issue of measurement noise was considered. The second modification preserved its control ability only for very small noise levels. Already at a noise level of 5 x 10 ' the second modification also failed. At the present stage of the development of the control method the control performance is not satisfactory in the presence of noise. In the immediate future an improvement might come from applying the recently developed noise reduction methods to determine the necessary control parameters more accurately. As in other areas of time series analysis the problem of noise will stay in the center of interest.

In conclusion, we introduced two modifications of the control formula of OGY which can lead to a better performance of the control in the case that the dynamical system is reconstructed using time delay coordinates. These modifications therefore extend the range of applicability of the OGY control method. With these modifications all remarkable advantages of the OGY control method are preserved: the dynamical equations are not required, the perturbations of the accessible parameter can be very small, different periodic points can be stabilized in the same parameter range for the same system, and, once the control coefficients have been determined, the computational effort at every iteration is negligible, so that real-time applications are possible. We expect that the OGY control method will yield important applications in the future also for technical systems.
We describe a method that converts the motion on a chaotic attractor to a desired attracting time periodic motion by making only small time dependent perturbations of a control parameter. The time periodic motion results from the stabilization of one of the infinite number of previously unstable periodic orbits embedded in the attractor. The present paper extends that of Ott, Grebogi and Yorke [Phys. Rev. Lett. 64 (1990) 1196], allowing for a more general choice of the feedback matrix and implementation to higher-dimensional systems. The method is illustrated by an application to the control of a periodically impulsively kicked dissipative mechanical system with two degrees of freedom resulting in a four-dimensional map (the “double rotor map”). A key issue addressed is that of the dependence of the average time to achieve control on the size of the perturbations and on the choice of the feedback matrix.


1.  Introduction 

It is common for systems to evolve with time in a chaotic way. In practice, however, it is often desired that chaos be avoided and/or that the system be optimized with respect to some performance criterion. Given a system which behaves chaotically, one approach might be to make some large (and possibly costly) alteration in the system which completely changes its dynamics in such a way as to achieve the desired objectives. Here we assume that this avenue is not available. Thus we address the following question: Given a chaotic system, how can we obtain improved performance and achieve a desired attracting time-periodic motion by making only small controlling temporal perturbations in an accessible system parameter?


The key observation is that a chaotic attractor typically has embedded densely within it an infinite number of unstable periodic orbits [1-5]. In addition, chaotic attractors can also sometimes contain unstable steady states (e.g., the Lorenz attractor has such an embedded steady state). Since we wish to make only small controlling perturbations to the system, we do not envision creating new orbits with very different properties from the already existing orbits. Thus we seek to exploit the already existing unstable periodic orbits and unstable steady states. Our approach is as follows: We first determine some of the unstable low-period periodic orbits and unstable steady states that are embedded in the chaotic attractor. We then examine these orbits and choose one which yields improved system performance. Finally, we apply small controls so as to stabilize this already existing orbit.

Some  comments  concerning  this  method  are 
the  following: 

(1) Before settling into the desired controlled orbit the trajectory experiences a chaotic transient whose expected duration diverges as the maximum allowed size of the control approaches zero.

(2) Small noise can result in occasional bursts in which the orbit wanders far from the controlled orbit.

(3) Controlled chaotic systems offer an advantage in flexibility in that any one of a number of different orbits can be stabilized by the small control, and the choice can be switched from one to another depending on the current desired system performance.

Although  we  describe  the  details  only  in  the 
case  of  discrete  time  systems,  this  method  is 
applicable  in  the  continuous  time  case  as  well  by 
considering  the  discrete  time  system  obtained 
from  the  induced  dynamics  on  a  Poincare 
section. 

In order to illustrate the method we apply it to a periodically forced mechanical system (the kicked double rotor), which results in a four-dimensional map. Amongst the examples considered, we study cases where the unstable orbit of the uncontrolled system has two unstable eigenvalues and two stable eigenvalues, and the stabilization is achieved by variation of one control parameter characterizing the strength of the periodic forcing. The present paper generalizes our previous work [6] to the case of higher-dimensional systems [7] and also includes new material illustrating the effect of the choice of stabilization on the length of the chaotic transient experienced by the orbit before control is achieved. Other relevant references on the feedback stabilization of periodic or steady orbits embedded in chaotic attractors are the experiments of Ditto et al. [8], Singer et al. [9], and the paper of Fowler [10]. (Other works in the general field are listed in ref. [11].)

The plan of the paper is as follows. In section 2, we give an implementation of the method, initially developed in ref. [6], by using the “pole placement technique” [7, 12]. In particular, we address the problem of stabilization of periodic orbits with more than one unstable eigenvalue. We also discuss experimental implementation in the absence of an a priori mathematical system model and generalization of the method to deal with cases where delay coordinate embedding is used. In section 3 we present some results for the control of the Hénon map [13], a two-dimensional system that is used as a paradigm in the study of dynamical systems; these results extend those given in ref. [6] in directions relevant to our present study. In section 4 we present results for the control of the double rotor map [14], a four-dimensional system that describes a particular impulsively periodically forced mechanical system. Finally, in section 5 we present the main conclusions of the work.

2.  Description  of  the  method 

2.1.  Formulation 

For the sake of simplicity we consider a discrete time dynamical system,

$Z_{i+1} = F(Z_i, p)$,  (2.1)

where $Z_i \in \mathbb{R}^n$, $p \in \mathbb{R}$ and $F$ is sufficiently smooth in both variables. Here, $p$ is considered a real parameter which is available for external adjustment but is restricted to lie in some small interval,

$|p - \bar{p}| < \delta$,  (2.2)

around a nominal value $\bar{p}$. We assume that the nominal system (i.e., for $p = \bar{p}$) contains a chaotic attractor. Our objective is to vary the parameter $p$ with time $i$ in such a way that for almost all initial conditions in the basin of the chaotic attractor, the dynamics of the system converge onto a desired time periodic orbit contained in the attractor. The control strategy is the following. We will find a stabilizing local feedback control law which is defined on a neighborhood of the desired periodic orbit. This is done by considering the first order approximation of the system at the chosen unstable periodic orbit. Here we assume that this approximation is stabilizable. Since stabilizability is a generic property of linear systems, this assumption is quite reasonable. The ergodic nature of the chaotic dynamics ensures that the state trajectory eventually enters into the neighborhood. Once inside, we apply the stabilizing feedback control law in order to steer the trajectory towards the desired orbit.

For simplicity we shall describe the method as applied to the stabilization of fixed points (i.e., period one orbits) of the map $F$. The consideration of periodic orbits of period larger than one is straightforward and is discussed in section 2.5. Let $Z_*(\bar{p})$ denote an unstable fixed point on the attractor. For values of $p$ close to $\bar{p}$ and in the neighborhood of the fixed point $Z_*(\bar{p})$ the map (2.1) can be approximated by the linear map

$Z_{i+1} - Z_*(\bar{p}) = A[Z_i - Z_*(\bar{p})] + B(p - \bar{p})$,  (2.3)

where $A$ is an $n \times n$ Jacobian matrix and $B$ is an $n$-dimensional column vector,

$A = D_Z F(Z, p)$,  (2.4)

$B = D_p F(Z, p)$,  (2.5)

and these partial derivatives are evaluated at $Z = Z_*(\bar{p})$ and $p = \bar{p}$. We now introduce the time-dependence of the parameter $p$ by assuming that it is a linear function of the variable $Z_i$ of the form

$p - \bar{p} = -K^T[Z_i - Z_*(\bar{p})]$.  (2.6)

The $1 \times n$ matrix $K^T$ is to be determined so that the fixed point $Z_*(\bar{p})$ becomes stable. Substituting (2.6) into (2.3) we obtain

$Z_{i+1} - Z_*(\bar{p}) = (A - BK^T)[Z_i - Z_*(\bar{p})]$,  (2.7)

which shows that the fixed point will be stable provided the matrix $A - BK^T$ is asymptotically stable; that is, all its eigenvalues have modulus smaller than unity.
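The stability condition on $A - BK^T$ can be checked directly for a small example. The sketch below treats the $n = 2$ case in plain Python; the matrices `A`, `B` and the gain `K` are hypothetical numbers chosen for illustration, not values from the paper.

```python
import cmath

def closed_loop(A, B, K):
    """Return the 2x2 matrix A - B K^T (B is a column, K^T a row)."""
    return [[A[r][c] - B[r] * K[c] for c in range(2)] for r in range(2)]

def eigenvalues_2x2(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

def is_asymptotically_stable(M):
    """True if every eigenvalue has modulus smaller than unity."""
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(M))

# Hypothetical linearization with one unstable direction, and a trial gain.
A = [[-1.8, 0.3], [1.0, 0.0]]
B = [1.0, 0.0]
K = [-1.9, 0.3]
```

Here `A` alone has an eigenvalue of modulus larger than one, whereas `closed_loop(A, B, K)` has eigenvalues close to 0.1 and 0, so the assumed gain stabilizes the linearized fixed point.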

The solution to the problem of determining $K^T$ such that the eigenvalues of the matrix $A - BK^T$ have specified values is well known from control systems theory and is called the “pole placement technique” (see, for example, Ogata [12]). We summarize the relevant results.

2.2. Review of the pole placement technique

The eigenvalues of the matrix $A - BK^T$ are called the “regulator poles”, and the problem of placing these poles at the desired locations by choosing $K^T$ with $A$ and $B$ given is the “pole placement problem”.

Pole placement problem. Determine the matrix $K^T$ in such a way that the eigenvalues of the matrix $A - BK^T$ have specified (complex) values $\{\mu_1, \ldots, \mu_n\}$.

The following results [12] give a necessary and sufficient condition for a unique solution of the pole placement problem to exist, and also a method for obtaining it (Ackermann's method).

(1) The pole placement problem has a unique solution if and only if the $n \times n$ matrix

$C = (B : AB : A^2 B : \cdots : A^{n-1} B)$

is of rank $n$. ($C$ is called the controllability matrix.)

(2) The solution of the pole placement problem is given by

$K^T = (\alpha_n - a_n \;\; \cdots \;\; \alpha_1 - a_1)\, T^{-1}$,

where $T = CW$, and

$W = \begin{pmatrix} a_{n-1} & a_{n-2} & \cdots & a_1 & 1 \\ a_{n-2} & a_{n-3} & \cdots & 1 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ a_1 & 1 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix}$.

Here $\{a_1, \ldots, a_n\}$ are the coefficients of the characteristic polynomial of $A$,

$|sI - A| = s^n + a_1 s^{n-1} + \cdots + a_n$,

and $\{\alpha_1, \ldots, \alpha_n\}$ are the coefficients of the desired characteristic polynomial of $A - BK^T$,

$\prod_{j=1}^{n} (s - \mu_j) = s^n + \alpha_1 s^{n-1} + \cdots + \alpha_n$.
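For $n = 2$ the formula $K^T = (\alpha_2 - a_2 \;\; \alpha_1 - a_1)\,T^{-1}$ with $T = CW$ can be written out by hand. The following sketch does so in plain Python, assuming real desired poles; the function name `place_poles_2d` and the sample system are hypothetical.

```python
def place_poles_2d(A, B, mu1, mu2):
    """Gain K^T for a 2x2 system placing the poles of A - B K^T at mu1, mu2."""
    # Coefficients of |sI - A| = s^2 + a1 s + a2.
    a1 = -(A[0][0] + A[1][1])
    a2 = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # Coefficients of (s - mu1)(s - mu2) = s^2 + alpha1 s + alpha2.
    alpha1 = -(mu1 + mu2)
    alpha2 = mu1 * mu2
    # Controllability matrix C = (B : AB); it must have rank 2.
    AB = [A[0][0] * B[0] + A[0][1] * B[1], A[1][0] * B[0] + A[1][1] * B[1]]
    C = [[B[0], AB[0]], [B[1], AB[1]]]
    if abs(C[0][0] * C[1][1] - C[0][1] * C[1][0]) < 1e-12:
        raise ValueError("controllability matrix is singular")
    # T = C W with W = [[a1, 1], [1, 0]].
    T = [[C[0][0] * a1 + C[0][1], C[0][0]],
         [C[1][0] * a1 + C[1][1], C[1][0]]]
    det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    T_inv = [[T[1][1] / det_T, -T[0][1] / det_T],
             [-T[1][0] / det_T, T[0][0] / det_T]]
    row = [alpha2 - a2, alpha1 - a1]
    return [row[0] * T_inv[0][0] + row[1] * T_inv[1][0],
            row[0] * T_inv[0][1] + row[1] * T_inv[1][1]]

# Hypothetical system and desired regulator poles.
A = [[0.5, 1.0], [2.0, 1.5]]
B = [0.0, 1.0]
K = place_poles_2d(A, B, 0.2, -0.3)
M = [[A[r][c] - B[r] * K[c] for c in range(2)] for r in range(2)]
```

The trace and determinant of `M` then equal the sum and product of the requested poles, which is a quick way to confirm the placement without computing eigenvalues.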

2.3. Control parameter

Our considerations so far are based on the linear eq. (2.7) and therefore only apply in the local region near $Z_*(\bar{p})$. On the other hand, the limitation in the size of the parameter perturbations given by (2.2), when combined with (2.6), yields

$|K^T[Z_i - Z_*(\bar{p})]| < \delta$.  (2.8)

This defines a slab of width $2\delta/|K^T|$. We choose to activate the control according to (2.6) only for values of $Z_i$ inside this slab, and we choose to leave the control parameter at its nominal value (i.e., $p = \bar{p}$) when $Z_i$ is outside this slab. Other choices are possible.

In summary, the control is determined by

$p - \bar{p} = -K^T[Z_i - Z_*(\bar{p})]\; u(\delta - |K^T[Z_i - Z_*(\bar{p})]|)$,  (2.9)

for arbitrary $Z_i$ [not necessarily close to $Z_*(\bar{p})$], where $u$ is the unit step function defined by

$u(\sigma) = 0$ if $\sigma < 0$,  $u(\sigma) = 1$ if $\sigma > 0$.

At this stage it should be pointed out that the matrix $K^T$ can be chosen in many different ways. In principle, any choice of regulator poles inside the unit circle serves our purpose. In ref. [6], the authors made a very special, though quite natural, choice of the gain matrix $K^T$: the resulting value of $p - \bar{p}$ forces the orbit onto the (linear) stable manifold of the fixed point at each iteration. In terms of regulator poles this choice corresponds to setting $n_s$ of these poles equal to the stable eigenvalues of the matrix $A$ and the remaining $n - n_s$ to 0. In terms of the slab (2.8) this choice corresponds not only to orienting it parallel to the stable manifold but also to taking an appropriate width.

The choice of the matrix $K^T$ will be discussed at some length in our applications of the method in sections 3 and 4.

2.4. Time to achieve control

The control is activated (i.e., $p \ne \bar{p}$) only if $Z_i$ falls in the narrow slab (2.8). Thus, for small $\delta$, a typical initial condition will execute a chaotic orbit, unchanged from the uncontrolled case, until $Z_i$ falls in this slab. Even then, because of nonlinearity not included in the linearized eq. (2.7), the control may not be able to bring the orbit to the fixed point. In this case the orbit will leave the slab and continue to wander chaotically as if there was no control. Since the orbit on the uncontrolled chaotic attractor is ergodic, at some time it will eventually satisfy (2.8) and also be sufficiently close to the desired fixed point so that control is achieved.

Thus, we create a stable orbit, which, for a typical initial condition, is preceded by a chaotic transient [15-18] in which the orbit is similar to orbits on the uncontrolled chaotic attractor. The length $\tau$ of such a chaotic transient depends sensitively on the initial condition of the particular orbit. For initial conditions randomly chosen in the basin of attraction the distribution of chaotic transient lengths is exponential [15, 16],

$\phi(\tau) \sim \exp[-(\tau/\langle\tau\rangle)]$,  (2.10)

for large $\tau$. The quantity $\langle\tau\rangle$ is the characteristic length of the chaotic transient, called in the present case the average time to achieve control. Estimates of the scaling of $\langle\tau\rangle$ with $\delta$ for small $\delta$ are given in Appendix A for the case of two-dimensional maps.
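The interplay of the slab condition (2.8), the control law (2.9) and the chaotic transient can be tried out on a small example. The sketch below assumes a Hénon-type map $x' = p - x^2 + 0.3y$, $y' = x$ with nominal parameter 1.4. The map form, the parameter values, the bound `DELTA` and the initial condition are illustrative assumptions, not values taken from this section. The gain places the regulator poles at the stable eigenvalue and at 0.

```python
import math

# Assumed nominal parameter, map constant and maximum perturbation size.
P_BAR, B_HENON, DELTA = 1.4, 0.3, 0.01

def henon(x, y, p):
    """Assumed example map: x' = p - x^2 + 0.3 y, y' = x."""
    return p - x * x + B_HENON * y, x

# Unstable fixed point x* = y* of the nominal map.
X_STAR = (-(1.0 - B_HENON) + math.sqrt((1.0 - B_HENON) ** 2 + 4.0 * P_BAR)) / 2.0
# Stable eigenvalue of the Jacobian [[-2 x*, 0.3], [1, 0]].
LAM_S = -X_STAR + math.sqrt(X_STAR ** 2 + B_HENON)
# Gain K^T giving regulator poles {LAM_S, 0} for B = (1, 0)^T.
K = (-LAM_S - 2.0 * X_STAR, B_HENON)

def run_control(x=0.0, y=0.0, max_steps=200000, tol=1e-9):
    """Iterate with control law (2.9); return (steps until within tol, x, y)."""
    for i in range(max_steps):
        s = K[0] * (x - X_STAR) + K[1] * (y - X_STAR)
        dp = -s if abs(s) < DELTA else 0.0  # control acts only inside the slab
        x, y = henon(x, y, P_BAR + dp)
        if math.hypot(x - X_STAR, y - X_STAR) < tol:
            return i + 1, x, y
    return None, x, y

steps, x_end, y_end = run_control()
```

The returned `steps` is one sample of the transient length $\tau$ for the chosen initial condition; repeating the run over many initial conditions would give an estimate of $\langle\tau\rangle$ for the assumed `DELTA`.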

2.5.  Control  of  periodic  orbits  of  period  greater 
than  one 

The analysis of periodic orbits given in sections 2.1-2.3 can be extended to nontrivial periodic orbits (i.e., orbits with period greater than one). The most direct way is to take the $T$th iterate of the map, where $T$ denotes the period of the orbit to be stabilized. For the $T$ times iterated map, any point on the periodic orbit is a fixed point, and we can then apply the discussion of sections 2.1-2.3. This method is, however, overly sensitive to noise, especially when long-period periodic orbits are involved. Next we outline another method which we believe should, in general, be better. In terms of the treatment of section 2.2, the prescription we give below corresponds to placing the unstable eigenvalues of the uncontrolled problem at zero, while leaving the stable eigenvalues unchanged. (This is only one of many possibilities that could be given.)

We denote the periodic orbit by $Z_{i*}(\bar{p})$, where $Z_{(i+T)*}(\bar{p}) = Z_{i*}(\bar{p})$. In addition, we introduce the set of $T$ matrices $A_i$ which are $n \times n$ and the set of $T$ column vectors $B_i$ which are of dimension $n$, where

$A_i = A_{i+T} = D_Z F(Z, p)$,

$B_i = B_{i+T} = D_p F(Z, p)$,

and the partial derivatives are evaluated at $Z = Z_{i*}(\bar{p})$ and $p = \bar{p}$.

Linearizing as in eq. (2.3), we have

$Z_{i+1} - Z_{(i+1)*}(\bar{p}) = A_i[Z_i - Z_{i*}(\bar{p})] + B_i(p_i - \bar{p})$.  (2.11)

Say that the periodic orbit has $u$ unstable eigenvalues (i.e., $u$ eigenvalues with magnitude greater than one) and $s$ stable eigenvalues, where $u + s = n$. At each point $Z_{i*}(\bar{p})$ on the $p = \bar{p}$ periodic orbit, determine vectors $\{v_{i,1}, v_{i,2}, \ldots, v_{i,s}\}$ which span the linearized stable subspace. Now let

$\Phi_{i,j} = A_{i+u-1} A_{i+u-2} \cdots A_{i+j}$

for $j = 1, 2, \ldots, (u - 1)$ and

$C_i = (\Phi_{i,1} B_i : \Phi_{i,2} B_{i+1} : \cdots : \Phi_{i,u-1} B_{i+u-2} : B_{i+u-1} : v_{i+u,1} : v_{i+u,2} : \cdots : v_{i+u,s})$.

(One choice of the vectors $\{v_{i,1}, v_{i,2}, \ldots, v_{i,s}\}$ is the stable eigenvectors of $A_i A_{i+1} \cdots A_{i+T-1}$.)
The controllability condition (analogous to that in section 2.2) is that $C_i$ be nonsingular. The desired result for the control is then specified by

$p_i - \bar{p} = -K_i^T[Z_i - Z_{i*}(\bar{p})]$,  (2.12a)

where

$K_i^T = \kappa\, C_i^{-1} \Phi_{i,0}$,  (2.12b)

and $\kappa$ denotes an $n$-dimensional row vector whose first entry is one and all of whose remaining entries are zeros.

To derive eqs. (2.12) we iterate (2.11) $u$ times,

$Z_{i+u} - Z_{(i+u)*}(\bar{p}) = \Phi_{i,0}[Z_i - Z_{i*}(\bar{p})] + \Phi_{i,1} B_i (p_i - \bar{p}) + \Phi_{i,2} B_{i+1} (p_{i+1} - \bar{p}) + \cdots + B_{i+u-1}(p_{i+u-1} - \bar{p})$.  (2.13a)

We then demand that $Z_{i+u}$ land on the linearized stable manifold of the periodic orbit through the point $Z_{(i+u)*}(\bar{p})$. That is, we choose the $p$'s such that there exist $s$ coefficients $\alpha_1, \alpha_2, \ldots, \alpha_s$ such that

$Z_{i+u} - Z_{(i+u)*}(\bar{p}) = \alpha_1 v_{i+u,1} + \alpha_2 v_{i+u,2} + \cdots + \alpha_s v_{i+u,s}$.  (2.13b)

Regarding (2.13a) and (2.13b) as $n = u + s$ equations in the $n$ unknowns $p_i, p_{i+1}, \ldots, p_{i+u-1}, \alpha_1, \alpha_2, \ldots, \alpha_s$, we then solve for $p_i$ to obtain (2.12).

[Note from the above that at time $i$ we could, once and for all, calculate all the control parameter values to be applied in the next $u$ iterates, $p_i, p_{i+1}, \ldots, p_{i+u-1}$. In the presence of noise, however, this is not a good idea (assuming $u > 1$), since it does not take advantage of the opportunity to correct for the noise on each iterate. Therefore, we believe that, in the presence of noise, it is best to perform the calculation of $p_i$ via eq. (2.12) on each iterate.]
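In the simplest nontrivial setting, $n = 2$ with one unstable and one stable direction ($u = s = 1$), eq. (2.12b) reduces to $K_i^T = \kappa\,(B_i : v_{i+1,1})^{-1} A_i$. The sketch below evaluates this in plain Python. The function name and the numerical values are hypothetical, and the example uses a fixed point (so that $A_i$, $B_i$ and the stable vector are the same at every step) purely to keep the check short.

```python
import math

def periodic_gain_u1(A_i, B_i, v_next):
    """K_i^T = kappa C_i^{-1} Phi_{i,0} for n = 2, u = 1, s = 1.

    Here C_i = (B_i : v_{i+1,1}), Phi_{i,0} = A_i and kappa = (1, 0).
    """
    det = B_i[0] * v_next[1] - v_next[0] * B_i[1]
    if abs(det) < 1e-12:
        raise ValueError("C_i is singular")
    # kappa C_i^{-1} is the first row of the inverse of C_i.
    first_row = (v_next[1] / det, -v_next[0] / det)
    return [first_row[0] * A_i[0][0] + first_row[1] * A_i[1][0],
            first_row[0] * A_i[0][1] + first_row[1] * A_i[1][1]]

# Hypothetical linearization; its stable eigenvector is (lam_s, 1).
A = [[-1.8, 0.3], [1.0, 0.0]]
B = [1.0, 0.0]
lam_s = -0.9 + math.sqrt(0.81 + 0.3)
v_s = (lam_s, 1.0)
K = periodic_gain_u1(A, B, v_s)

# One controlled step of the linearized map applied to a small deviation.
dZ = (0.01, -0.02)
dp = -(K[0] * dZ[0] + K[1] * dZ[1])
dZ_next = (A[0][0] * dZ[0] + A[0][1] * dZ[1] + B[0] * dp,
           A[1][0] * dZ[0] + A[1][1] * dZ[1] + B[1] * dp)
```

After the controlled step, `dZ_next` is parallel to the stable vector `v_s`, which is the requirement expressed by (2.13b). For a genuine orbit of period $T > 1$ the same function would be called with the $A_i$, $B_i$ and $v_{i+1,1}$ belonging to each orbit point.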

2.6.  Use  of  delay  coordinates 

In experimental studies of chaotic dynamical systems, delay coordinates are often used to represent the system state. This is sometimes useful because it only requires measurement of the time series of a single scalar state variable, which we denote $\xi(t)$. A delay coordinate vector can be formed as follows:

$Z(t) = (\xi(t), \xi(t - T_D), \xi(t - 2T_D), \ldots, \xi(t - MT_D))$,

where $T_D$ is some conveniently chosen delay time, and the time variable $t$ is assumed continuous. Embedding theorems guarantee that for $M \ge 2n$, where $n$ is the system dimensionality, the vector $Z$ is generically a global one-to-one representation of the system state. (Actually, for our purposes, we do not require a global embedding; we only require $Z$ to be one-to-one in the small region near the periodic orbit, and this can typically be achieved with $M = n - 1$.) To obtain a map, one can take a Poincaré surface of section. For the often encountered case of a system which is periodically forced at a period $T_F$, one can define a “stroboscopic surface of section” by sampling the state at discrete times $t_i = iT_F + t_0$. In this case we have the discrete state variable

$Z_i = Z(t_i)$.
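For a uniformly sampled signal the construction above amounts to simple index arithmetic. The sketch below assumes that the delay $T_D$ and the forcing period $T_F$ are integer multiples of the sampling interval (`lag` and `period` samples respectively); the function names and the test signal are hypothetical.

```python
def delay_vector(xi, k, M, lag):
    """Z at sample index k: (xi[k], xi[k - lag], ..., xi[k - M*lag])."""
    if k - M * lag < 0:
        raise IndexError("not enough past samples for this delay vector")
    return tuple(xi[k - m * lag] for m in range(M + 1))

def stroboscopic_vectors(xi, M, lag, period, offset=0):
    """Delay vectors sampled once per forcing period, starting at offset."""
    start = offset
    while start - M * lag < 0:  # skip strobe times lacking enough history
        start += period
    return [delay_vector(xi, k, M, lag) for k in range(start, len(xi), period)]

# Hypothetical signal: the sample index itself, so the output is easy to read.
xi = list(range(20))
Z = stroboscopic_vectors(xi, M=2, lag=3, period=5)
```

With these assumed values the first usable strobe index is 10, and each element of `Z` has $M + 1 = 3$ components.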

As pointed out by Dressler and Nitsche [19], in the presence of parameter variation, delay coordinates lead to a map of a different form than

$$ Z_{i+1} = F(Z_i, p_i), $$

which is the form assumed in sections 2.1-2.5. For example, in the periodically forced case, since the components of $Z_i$ are $\xi(t_i - mT_D)$ for $m = 0, 1, \ldots, M$, the vector $Z_{i+1}$ must depend not only on $p_i$, but also on all previous values of the parameter that were in effect during the time interval $t_i - MT_D \le t \le t_i$. In particular, let $r$ be the smallest integer such that $MT_D < rT_F$. Then the relevant map is in general of the form

$$ Z_{i+1} = G(Z_i, p_i, p_{i-1}, \ldots, p_{i-r}). \tag{2.14a} $$

For $r = 1$ we have

$$ Z_{i+1} = G(Z_i, p_i, p_{i-1}). \tag{2.14b} $$

We now discuss how the technique of section 2.2 can be applied in the case of delay coordinates, and, for simplicity, we limit the discussion to $r = 1$, eq. (2.14b). Linearizing as in eq. (2.3) and again restricting our attention to the case of a fixed point orbit, we have

$$ Z_{i+1} - Z_*(\bar p) = A[Z_i - Z_*(\bar p)] + B_a(p_i - \bar p) + B_b(p_{i-1} - \bar p), \tag{2.15} $$

where $A = D_Z G(Z, p, p')$, $B_a = D_p G(Z, p, p')$, and $B_b = D_{p'} G(Z, p, p')$, and all partial derivatives are evaluated at $Z = Z_*(\bar p)$ and $p = p' = \bar p$.

Now define a new state variable with one extra component by

$$ \hat Z_{i+1} = \begin{pmatrix} Z_{i+1} \\ p_i \end{pmatrix}, \tag{2.16} $$

and introduce the linear control law

$$ p_i - \bar p = -K^T[Z_i - Z_*(\bar p)] - k(p_{i-1} - \bar p). \tag{2.17} $$

Combining these equations, we obtain

$$ \hat Z_{i+1} - \hat Z_*(\bar p) = (\hat A - \hat B \hat K^T)[\hat Z_i - \hat Z_*(\bar p)], \tag{2.18} $$

where

$$ \hat Z_*(\bar p) = \begin{pmatrix} Z_*(\bar p) \\ \bar p \end{pmatrix}, \quad \hat A = \begin{pmatrix} A & B_b \\ 0 & 0 \end{pmatrix}, \quad \hat B = \begin{pmatrix} B_a \\ 1 \end{pmatrix}, \quad \hat K = \begin{pmatrix} K \\ k \end{pmatrix}. $$

Since (2.18) is now of the same form as (2.3), the method of section 2.2 can be applied. (A similar result for any $r > 1$ also clearly holds.)

Another method of control for delay coordinates is to reduce (2.14b) directly to the form $Z_{j+1} = F(Z_j, p_j)$ and then proceed as in sections 2.1 and 2.2. This reduction can be done by setting $p_i = \bar p$ for every other time step. For example, say $p_i = \bar p$ for $i$ odd, and $j = i/2$ for even $i$. Then making the replacements $Z_i \to Z_j$, $p_i \to p_j$ for even $i$, and iterating (2.14b) twice we have

$$ Z_{j+1} = G[G(Z_j, p_j, \bar p), \bar p, p_j] \equiv F(Z_j, p_j), \tag{2.19} $$

which is of the required form. We believe, however, that the first method we have given [i.e., that based on eq. (2.18)] should usually be capable of yielding superior results to the method based on (2.19) with respect to noise sensitivity and time to achieve control. This is because our second method does not take advantage of the opportunity to control on each time iterate while our first method does.


3. Controlling the Hénon map

In ref. [6], the authors used the Hénon map to illustrate the control method and, in particular, to test their theoretical predictions concerning the average time to achieve control. As already pointed out, their work is based on a particular choice of the gain matrix $K^T$. In this section we consider how different choices of $K^T$ affect the average time to achieve control for the Hénon map.

The Hénon map [13] is the two-dimensional map

$$ Z \to Z' = F(Z), $$

defined by

$$ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} a - x^2 + by \\ x \end{pmatrix}, $$

where $(x, y) \in \mathbb{R} \times \mathbb{R}$. We keep the parameter $b$ fixed throughout ($b = 0.3$) and allow the control parameter $a$ to vary around a nominal value $\bar a$ ($\bar a = 1.4$) for which the map has a chaotic attractor.

For $a = \bar a = 1.4$ there is an unstable saddle fixed point contained in the chaotic attractor. This fixed point is located at

$$ Z_*(a) = x_*(a) \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad x_*(a) = -c + (c^2 + a)^{1/2}, \quad c = \tfrac{1}{2}(1 - b), $$

for $a \ge -c^2$. Noting that the Jacobian matrix of partial derivatives of the map is

$$ D_Z F(Z) = \begin{pmatrix} -2x & b \\ 1 & 0 \end{pmatrix}, $$

and that the stability of the fixed point is determined by the roots of the characteristic equation

$$ |D_Z F[Z_*(a)] - sI| = 0, $$

one can easily check that the fixed point is stable for $-c^2 < a < 3c^2$ and unstable for $a > 3c^2$. (Hence the fixed point is unstable for $b = 0.3$, $a = \bar a = 1.4$ since $c = 0.35$.)

The quantities that appear in section 2.2 are as follows:

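$$ A = \begin{pmatrix} -2x_* & b \\ 1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad C = (B \,\vdots\, AB) = \begin{pmatrix} 1 & -2x_* \\ 0 & 1 \end{pmatrix}, $$

$$ W = \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix}, \quad T = CW = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, $$

$$ K^T = (\alpha_1 - a_1, \; \alpha_2 - a_2), $$

where

$$ a_1 = 2x_* = -(\lambda_u + \lambda_s), \quad a_2 = -b = \lambda_u \lambda_s, $$

and

$$ \alpha_1 = -(\mu_1 + \mu_2), \quad \alpha_2 = \mu_1 \mu_2. $$

Here $x_* = x_*(\bar a)$, and $\lambda_u$ and $\lambda_s$ are the eigenvalues of matrix $A$,

$$ \lambda_{u,s} = -x_* \mp (x_*^2 + b)^{1/2}. $$

The quantities $\mu_1$ and $\mu_2$ are the regulator poles [i.e., the eigenvalues of $(A - BK^T)$].

Fig. 1. Hénon map: choice of regulator poles.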

In order to better illustrate the different choices of regulator poles or, equivalently, of the matrix $K^T$, we have used the plane $(\alpha_1, \alpha_2)$ [cf. fig. 1]. In this plane we have plotted the lines of marginal stability $\mu_1 = \pm 1$ ($1 \pm \alpha_1 + \alpha_2 = 0$) and $\mu_1 \mu_2 = 1$ ($\alpha_2 = 1$); the bounded triangular region delimited by these lines (shown shaded in the figure) is the region where the regulator poles are stable. In addition, we have plotted as dashed lines the axes $(k_1, k_2) = K^T$, which are related to $(\alpha_1, \alpha_2)$ by the translations

$$ k_1 = \alpha_1 - a_1, \quad k_2 = \alpha_2 - a_2. $$

The straight solid line in the figure going through the origin of the $(k_1, k_2)$ plane has slope $-\lambda_s$ and intersects the line $\alpha_2 = 0$ at the point $Q$ with coordinates $(\alpha_1, \alpha_2) = (-\lambda_s, 0)$. To this point correspond the regulator poles

$$ \mu_1 = 0, \quad \mu_2 = \lambda_s, $$

and the matrix

$$ K_Q^T = \lambda_u (1, -\lambda_s). $$

$K_Q^T$ is the special choice of matrix made in ref. [6].
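The pole-placement quantities of this section can be checked numerically. The sketch below assumes the Hénon map in the form $(x, y) \to (a - x^2 + by, x)$, which is the form consistent with the fixed point and stability bounds quoted in this section; the function names are illustrative only. It evaluates the gain at the point $Q$, its polar angle, and the trace and determinant of the closed-loop matrix $A - BK^T$ with $B = (1, 0)^T$.

```python
import math

A_BAR, B_PAR = 1.4, 0.3  # nominal parameter values quoted in the text


def henon_fixed_point(a=A_BAR, b=B_PAR):
    """x_* of the fixed point Z_* = x_* (1, 1) of (x, y) -> (a - x^2 + b*y, x)."""
    c = 0.5 * (1.0 - b)
    return -c + math.sqrt(c * c + a)


def eigenvalues(x_star, b=B_PAR):
    """Unstable and stable eigenvalues of A = [[-2 x_*, b], [1, 0]]."""
    root = math.sqrt(x_star * x_star + b)
    return -x_star - root, -x_star + root


def gain_from_poles(mu1, mu2, x_star, b=B_PAR):
    """K^T = (alpha1 - a1, alpha2 - a2) for regulator poles mu1, mu2."""
    a1, a2 = 2.0 * x_star, -b
    alpha1, alpha2 = -(mu1 + mu2), mu1 * mu2
    return alpha1 - a1, alpha2 - a2


x_star = henon_fixed_point()
lam_u, lam_s = eigenvalues(x_star)
k1, k2 = gain_from_poles(0.0, lam_s, x_star)   # gain at the point Q
theta_q = math.degrees(math.atan2(k2, k1))     # polar angle of K_Q^T

# Closed-loop matrix A - B K^T with B = (1, 0)^T.
closed = [[-2.0 * x_star - k1, B_PAR - k2], [1.0, 0.0]]
trace = closed[0][0] + closed[1][1]            # equals mu1 + mu2
det = closed[0][0] * closed[1][1] - closed[0][1] * closed[1][0]  # equals mu1 * mu2
```

With these values the gain reduces to $\lambda_u(1, -\lambda_s)$ and the polar angle lies in the second quadrant, close to the 171.1° quoted for $\theta_Q$ in the figure captions of this section.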

Before proceeding with the discussion, it is convenient to express the vector $K^T$ in polar coordinates

$$ K^T = |K^T|(\cos\theta, \sin\theta). $$

We consider the following two ways of varying the vector $K^T$ (inside the triangular region of stability):

(I) $\theta$ fixed, $|K^T|$ variable.

(II) $|K^T|$ fixed, $\theta$ variable.

In terms of the control slab defined by eq. (2.8) we have that in situation (I) the slab is kept orientated in a fixed direction while its width $w = 2\delta/|K^T|$ varies, whereas in situation (II) the direction of the slab is rotated while its width is kept fixed at $w = 2\delta/|K_Q^T|$. The choice of $K^T$ in ref. [6] has $\theta = \theta_Q = \tan^{-1}(-\lambda_s)$ and, as we shall see, this choice is optimal from the point of view of the time to achieve control. (To see that the choice of ref. [6] corresponds to $\theta = \theta_Q$, we note that with this choice one obtains a convergence rate to the periodic orbit of $\lambda_s$ as in ref. [6].)

In the numerical experiments we calculated the average time to achieve control by the method described in Appendix B. We also allowed for different values of the maximum amplitude of the parameter perturbations, $\delta$.
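To make the kind of experiment described here concrete, the following is a minimal sketch of a single control run; it is not the procedure of Appendix B. It assumes the Hénon map in the form $(x, y) \to (a - x^2 + by, x)$, the gain $K^T = \lambda_u(1, -\lambda_s)$ attributed to ref. [6], and a control law that perturbs $a$ by $-K^T(Z - Z_*)$ only when that perturbation is smaller than $\delta$ in magnitude. The function name and the default values are illustrative choices.

```python
import math

A_BAR, B_PAR = 1.4, 0.3  # nominal parameter values


def control_henon(delta=0.01, n_iter=50_000, x=0.1, y=0.1):
    """Iterate the controlled map and return the final distance to the fixed point."""
    c = 0.5 * (1.0 - B_PAR)
    x_star = -c + math.sqrt(c * c + A_BAR)          # fixed point Z_* = x_* (1, 1)
    root = math.sqrt(x_star * x_star + B_PAR)
    lam_u, lam_s = -x_star - root, -x_star + root   # eigenvalues of the Jacobian
    k1, k2 = lam_u, -lam_u * lam_s                  # gain K^T = lam_u (1, -lam_s)
    for _ in range(n_iter):
        s = k1 * (x - x_star) + k2 * (y - x_star)   # K^T (Z - Z_*)
        # Perturb the parameter only inside the control slab |K^T (Z - Z_*)| < delta.
        a = A_BAR - s if abs(s) < delta else A_BAR
        x, y = a - x * x + B_PAR * y, x
    return math.hypot(x - x_star, y - x_star)


final_distance = control_henon()
```

Repeating such runs from many initial conditions and recording the first iterate at which the orbit stays near the fixed point would give an estimate of the average time to achieve control for a given $\delta$ and $K^T$.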

First we consider the case where $\theta$ is fixed (case I) at the value

$$ \theta = \theta_Q. $$

This case has a simple interpretation in terms of regulator poles: $\mu_2 = \lambda_s$ is kept fixed while $\mu_1$ is allowed to vary between $-1$ and $+1$. $|K^T|$ and $\mu_1$ are related by

$$ |K^T| = |\mu_1 - \lambda_u|(1 + \lambda_s^2)^{1/2}. $$

Fig. 2 shows results for $\langle\tau\rangle$ for this case. We see that the average time to achieve control increases with $\mu_1$, although only moderately.

Fig. 3 shows results for $\langle\tau\rangle$ versus $\theta$ for $|K^T|$ held fixed (case II) at

$$ |K^T| = |K_Q^T| = |\lambda_u|(1 + \lambda_s^2)^{1/2}. $$

We see that the average time to achieve control has a strong minimum at $\theta = \theta_Q$.

Fig. 2. Hénon map: $\log_{10}\langle\tau\rangle$ versus $\mu_1$, with $\mu_2 = \lambda_s$, for three values of $\delta$ (symbols ○, □, △). The theoretical curve was calculated using eq. (A.9) of appendix A ($\bar a = 1.4$, $b = 0.3$).

Fig. 3. Hénon map: $\log_{10}\langle\tau\rangle$ versus $\theta$, with $|K^T| = |K_Q^T|$, for three values of $\delta$ (symbols ○, □, △) ($\bar a = 1.4$, $b = 0.3$).

Fig. 4 shows $\langle\tau\rangle$ versus $|K^T|$ for three values of $\theta$: $\theta_Q$ itself and two nearby values, one on either side of it (170.4°, $\theta_Q = 171.1°$, and 172.0°). We observe that the $\theta = \theta_Q$ result is always below the results for the other two angles, indicating that the average time to achieve control has a strong minimum at $\theta = \theta_Q$ not only for $|K^T| = |K_Q^T|$ but for all values of $|K^T|$. Thus the condition $\theta = \theta_Q$ is optimal.

Fig. 4. Hénon map: $\log_{10}\langle\tau\rangle$ versus $|K^T|$ for (○) $\theta = \theta_Q = \tan^{-1}(-\lambda_s) = 171.1°$, (□) $\theta = 170.4°$, (△) $\theta = 172.0°$.

In  Appendix  A  we  show  how  the  average  time 
to  achieve  control  can  be  obtained  theoretically 
in  the  case  of  two-dimensional  maps  and  verify 
that  there  is  excellent  agreement  between  the 
theoretical  and  experimental  results  in  the  case 
of  the  Henon  map. 


4.  Controlling  the  double  rotor 

In this section we apply the control method described in section 2 to a dynamical system known as the double rotor map. We start by deriving the map (section 4.1 and Appendix C), then study its fixed points (section 4.2) and its attractors (section 4.3), including chaotic ones, and finally proceed to control some of the fixed points embedded in one of the chaotic attractors (sections 4.4 and 4.5).

4.1.  The  double  rotor  map 

The  double  rotor  map  is  a  four-dimensional 
map  which  describes  the  time  evolution  of  a 
mechanical  system  known  as  the  kicked  double 
rotor  [14].  This  system  is  a  four-dimensional 
extension  of  the  kicked  (single)  rotor,  a  two- 
dimensional  system  that  is  described  by  the  well- 
known  dissipative  standard  map  [20]. 

The double rotor is composed of two thin, massless rods connected as shown in fig. 5. The first rod, of length $l_1$, pivots about $P_1$ (which is fixed), and the second rod, of length $2l_2$, pivots about $P_2$ (which moves). The angles $\theta_1(t)$, $\theta_2(t)$ specify the orientations at time $t$ of the first and second rods, respectively. A mass $m_1$ is attached at $P_2$, and masses $\tfrac{1}{2}m_2$ are attached to each end of the second rod ($P_3$ and $P_4$). Friction at $P_1$ (with coefficient $\nu_1$) slows the first rod at a rate proportional to its angular velocity $\dot\theta_1(t) = d\theta_1(t)/dt$; friction at $P_2$ (with coefficient $\nu_2$) slows the second rod (and simultaneously accelerates the first rod) at a rate proportional to $\dot\theta_2(t) - \dot\theta_1(t)$. The end of the second rod marked $P_3$ receives periodic impulse kicks at times $t = T, 2T, \ldots$, always from the same direction and with constant strength $f_0$. There is no gravity.

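In Appendix C we write the differential equations that describe the kicked double rotor and proceed to derive from them the double rotor map relating the state of the system just after consecutive kicks. We obtain the four-dimensional map

$$ Z \to Z' = F(Z), $$

defined by

$$ \begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} MY + X \\ LY + G(X') \end{pmatrix}, \tag{4.1} $$

where

$$ X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in S^1 \times S^1, \quad Y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \in \mathbb{R} \times \mathbb{R}, \tag{4.2} $$

and

$$ G(X') = \begin{pmatrix} c_1 \sin x_1' \\ c_2 \sin x_2' \end{pmatrix}. $$

$x_1$, $x_2$ are the angular positions of the rods at the instant of the $k$th kick, $x_j = \theta_j(kT)$, while $y_1$, $y_2$ are the angular velocities of the rods immediately after the $k$th kick, $y_j = \dot\theta_j(kT^+)$. $S^1$ is the circle $\mathbb{R}\ (\mathrm{mod}\ 2\pi)$. $L$ and $M$ are constant $2 \times 2$ matrices defined by

$$ L = \sum_{j=1}^{2} W_j e^{\lambda_j T}, \quad M = \sum_{j=1}^{2} W_j \frac{e^{\lambda_j T} - 1}{\lambda_j}, $$

$$ W_1 = \begin{pmatrix} a & b \\ b & d \end{pmatrix}, \quad W_2 = \begin{pmatrix} d & -b \\ -b & a \end{pmatrix}, $$

$$ a = \tfrac12\left(1 + \frac{\nu_1}{\Delta}\right), \quad d = \tfrac12\left(1 - \frac{\nu_1}{\Delta}\right), \quad b = -\frac{\nu_2}{\Delta}, $$

$$ \lambda_{1,2} = -\tfrac12(\nu_1 + 2\nu_2 \pm \Delta), \quad \Delta = (\nu_1^2 + 4\nu_2^2)^{1/2}. $$

Finally, $c_1$ and $c_2$ are given by

$$ c_j = \frac{f_0}{I}\, l_j, \quad j = 1, 2, $$

where

$$ I = (m_1 + m_2) l_1^2 = m_2 l_2^2. $$

The following relation between matrices $L$ and $M$ will be useful below:

$$ L = I_2 + A_\nu M, \tag{4.3} $$

where

$$ A_\nu = \begin{pmatrix} -(\nu_1 + \nu_2) & \nu_2 \\ \nu_2 & -\nu_2 \end{pmatrix}. $$

($\lambda_1$, $\lambda_2$ are precisely the eigenvalues of $A_\nu$.) Note also that

$$ |M| = \frac{e^{\lambda_1 T} - 1}{\lambda_1} \, \frac{e^{\lambda_2 T} - 1}{\lambda_2}, \quad |A_\nu| = \nu_1 \nu_2. $$

From now on we assume that $\nu_1 = \nu_2 = \nu$. This leads to

$$ \lambda_{1,2} = -\tfrac12 \nu (3 \pm \sqrt5), \quad a, d = \tfrac12\left(1 \pm \tfrac15\sqrt5\right), \quad b = -\tfrac15\sqrt5. $$

In all the numerical work described in the rest of this section the parameters $\nu$, $T$, $I$, $m_1$, $m_2$ and $l_2$ were kept fixed at the values

$$ \nu = T = I = m_1 = m_2 = l_2 = 1, \quad l_1 = 1/\sqrt2. \tag{4.4} $$

The only parameter which we shall vary is the forcing $f_0$, used as the control parameter.

4.2. Fixed points of the double rotor map

The fixed points $Z_* = (X_*, Y_*)$ of the map (4.1) are solutions of the system

$$ X_* = MY_* + X_* - 2\pi N, \quad Y_* = LY_* + G(X_*), \tag{4.5} $$

where the components of the vector $N = (n_1, n_2)^T$ are integer and are the rotation numbers in the $x_1$, $x_2$ variables. The rotation numbers $n_1$, $n_2$ are defined as the multiples of $2\pi$ by which $x_{1*}$, $x_{2*}$ are increased in one iteration of the map before being brought to the interval $[0, 2\pi]$. From eqs. (4.5) we obtain, using (4.3),

$$ Y_* = 2\pi M^{-1} N, \quad G(X_*) = -2\pi A_\nu N. \tag{4.6} $$

Using the definitions of the matrices $G$ and $A_\nu$, we rewrite the second of the eqs. (4.6) in the form

$$ \begin{pmatrix} \sin x_{1*} \\ \sin x_{2*} \end{pmatrix} = -\frac{2\pi\nu I}{f_0} \begin{pmatrix} (1/l_1)(-2n_1 + n_2) \\ (1/l_2)(n_1 - n_2) \end{pmatrix} \equiv \frac{1}{f_0} \begin{pmatrix} f_{01} \\ f_{02} \end{pmatrix}, \tag{4.7} $$

where the identity on the right defines the two new quantities $f_{01}$ and $f_{02}$. These equations show that for each pair of rotation numbers $(n_1, n_2)$ a set of four possible solutions for $(x_{1*}, x_{2*})$ exists if $|f_0| \ge |f_{0c}|$ where $|f_{0c}| = \max(|f_{01}|, |f_{02}|)$. The four fixed points correspond to the four combinations of values of $(x_{1*}, x_{2*})$ that have the same pair of values of $(\sin x_{1*}, \sin x_{2*})$. When necessary we will use the notation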

$$ Z_*^{[N;q]} = \begin{pmatrix} X_*^{[N;q]} \\ Y_*^{[N]} \end{pmatrix}, $$

or more simply $[N; q]$, to identify the fixed points, where the index $q$ labels the four possible solutions of (4.7) ($q = 1, 2, 3, 4$) and, as shown in fig. 6, corresponds to the ordering

$$ x_{1*}^{[N;1]} = x_{1*}^{[N;2]} \le x_{1*}^{[N;3]} = x_{1*}^{[N;4]}, $$

$$ x_{2*}^{[N;1]} = x_{2*}^{[N;3]} \le x_{2*}^{[N;2]} = x_{2*}^{[N;4]}. $$

Note that $Y_*^{[N]} = (y_{1*}^{[N]}, y_{2*}^{[N]})$ is the same for the four fixed points (i.e., it does not depend on $q$). Eqs. (4.7) also show that for $|f_0| \ge |f_{0c}|$, $(n_1, n_2) \ne (0, 0)$, another set of four fixed points exists with rotation numbers $(-n_1, -n_2)$. It is easy to see that to each point $(x_{1*}, x_{2*}, y_{1*}, y_{2*})$ of the first set corresponds a point of the second set given by $(2\pi - x_{1*}, 2\pi - x_{2*}, -y_{1*}, -y_{2*})$. This is a reflection of the fact that the double rotor map (4.1) itself is invariant under the change of variables $(x_1, x_2, y_1, y_2) \mapsto (2\pi - x_1, 2\pi - x_2, -y_1, -y_2)$.

Fig. 6. Double rotor map: labeling of fixed points.

In table 1 we summarize the properties of the five sets of fixed points (36 fixed points) with smaller values of $f_{0c}$ (when the other parameters of the map take the values specified by eqs. (4.4)), with rotation numbers $N = (0, 0)$, $\pm(1, 2)$, $\pm(0, 1)$, $\pm(1, 1)$, $\pm(2, 3)$. Note that the last three sets have the same value of $f_{0c}$. In fig. 7 we have plotted these fixed points in the plane $(x_1, x_2)$. Their $(y_1, y_2)$ coordinates are given by the first of eqs. (4.6).
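The fixed-point construction of this subsection can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes $\nu_1 = \nu_2 = \nu$ with the parameter values of eq. (4.4), builds $L$ and $M$ from the standard definitions for the kicked double rotor (sums over two projection matrices $W_j$ weighted by $e^{\lambda_j T}$ and $(e^{\lambda_j T} - 1)/\lambda_j$), solves the sine conditions (4.7), and verifies that each candidate is mapped to itself by (4.1). All function and variable names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi
NU, PERIOD, INERTIA = 1.0, 1.0, 1.0       # nu, T, I of eq. (4.4)
LEN1, LEN2 = 1.0 / math.sqrt(2.0), 1.0    # l_1, l_2 of eq. (4.4)


def rotor_matrices(nu1=NU, nu2=NU, period=PERIOD):
    """Return the 2x2 matrices L and M of the double rotor map."""
    delta = math.sqrt(nu1 ** 2 + 4.0 * nu2 ** 2)
    lams = (-0.5 * (nu1 + 2.0 * nu2 + delta), -0.5 * (nu1 + 2.0 * nu2 - delta))
    a, d, b = 0.5 * (1.0 + nu1 / delta), 0.5 * (1.0 - nu1 / delta), -nu2 / delta
    ws = (((a, b), (b, d)), ((d, -b), (-b, a)))
    L = [[0.0, 0.0], [0.0, 0.0]]
    M = [[0.0, 0.0], [0.0, 0.0]]
    for w, lam in zip(ws, lams):
        e = math.exp(lam * period)
        for r in range(2):
            for c in range(2):
                L[r][c] += w[r][c] * e
                M[r][c] += w[r][c] * (e - 1.0) / lam
    return L, M


def rotor_step(z, f0, L, M):
    """One iteration of (4.1): X' = MY + X (mod 2 pi), Y' = LY + G(X')."""
    x1, x2, y1, y2 = z
    nx1 = (M[0][0] * y1 + M[0][1] * y2 + x1) % TWO_PI
    nx2 = (M[1][0] * y1 + M[1][1] * y2 + x2) % TWO_PI
    ny1 = L[0][0] * y1 + L[0][1] * y2 + f0 * LEN1 / INERTIA * math.sin(nx1)
    ny2 = L[1][0] * y1 + L[1][1] * y2 + f0 * LEN2 / INERTIA * math.sin(nx2)
    return nx1, nx2, ny1, ny2


def sine_roots(s):
    """Both angles in [0, 2 pi) whose sine equals s, smallest first."""
    base = math.asin(s)
    return sorted((base % TWO_PI, (math.pi - base) % TWO_PI))


def fixed_points(n1, n2, f0, M):
    """The four fixed points [N; q], q = 1..4, or [] if f0 is below f_0c."""
    s1 = TWO_PI * NU * INERTIA * (2 * n1 - n2) / (f0 * LEN1)
    s2 = TWO_PI * NU * INERTIA * (n2 - n1) / (f0 * LEN2)
    if max(abs(s1), abs(s2)) > 1.0:
        return []
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    y1 = TWO_PI * (M[1][1] * n1 - M[0][1] * n2) / det   # Y_* = 2 pi M^{-1} N
    y2 = TWO_PI * (-M[1][0] * n1 + M[0][0] * n2) / det
    return [(x1, x2, y1, y2) for x1 in sine_roots(s1) for x2 in sine_roots(s2)]


def circle_dist(a, b):
    """Distance between two angles measured on the circle."""
    return abs((a - b + math.pi) % TWO_PI - math.pi)


L, M = rotor_matrices()
ROTATIONS = [(0, 0), (1, 2), (-1, -2), (0, 1), (0, -1),
             (1, 1), (-1, -1), (2, 3), (-2, -3)]
points = [z for n1, n2 in ROTATIONS for z in fixed_points(n1, n2, 9.0, M)]
residual = max(
    max(circle_dist(w[0], z[0]), circle_dist(w[1], z[1]),
        abs(w[2] - z[2]), abs(w[3] - z[3]))
    for z in points
    for w in [rotor_step(z, 9.0, L, M)]
)
```

At $f_0 = 9.0$ all nine rotation-number pairs listed in the text admit solutions, giving the 36 fixed points mentioned above; the nested loop in `fixed_points` returns them in the order $q = 1, 2, 3, 4$ of the labeling convention.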

Let us now turn our attention to the stability of the fixed points. The basic element of the analysis is the Jacobian ($4 \times 4$) matrix of partial derivatives of the map (4.1),

$$ D_Z F(Z) = \begin{pmatrix} I_2 & M \\ H(X') & L + H(X')M \end{pmatrix}, $$

where

$$ H(X') = D_X G(X') = \begin{pmatrix} c_1 \cos x_1' & 0 \\ 0 & c_2 \cos x_2' \end{pmatrix}, $$

and $I_n$ denotes the $n \times n$ identity matrix. The characteristic polynomial of $D_Z F(Z_*)$ is

$$ P(s) = |D_Z F(Z_*) - sI_4| = |s^2 I_2 - s(I_2 + L + HM) + L|, \tag{4.8a} $$

Table 1
Double rotor map: fixed points. The only stable fixed points are: [(0,0);4] in the interval $0 < f_0 < 4.27\ldots$; [(1,2);4] and [(-1,-2);1] in the interval $2\pi < f_0 < 7.01\ldots$ [the other parameters are given by eq. (4.4)].

$N$            $|f_{01}|$            $|f_{02}|$            $f_{0c}$
$(0,0)$        $0$                   $0$                   $0$
$\pm(1,2)$     $0$                   $2\pi\nu I/l_2$       $2\pi\nu I/l_2$
$\pm(0,1)$     $2\pi\nu I/l_1$       $2\pi\nu I/l_2$       $2\pi\nu I/l_1$
$\pm(1,1)$     $2\pi\nu I/l_1$       $0$                   $2\pi\nu I/l_1$
$\pm(2,3)$     $2\pi\nu I/l_1$       $2\pi\nu I/l_2$       $2\pi\nu I/l_1$
Fig. 7. Double rotor map: fixed points with rotation numbers $(n_1, n_2)$. The symbol (+) denotes fixed points with one unstable eigendirection, while the symbol (○) denotes fixed points with two unstable eigendirections [$f_0 = 9.0$, other parameters given by eq. (4.4)].

where, for simplicity, we have set $H = H(X_*)$. The characteristic equation

$$ P(s) = 0, \tag{4.8b} $$

determines the stability of the fixed points: if all the four roots have modulus smaller than one, the fixed point is stable. The stability of the fixed points as determined from eqs. (4.8) is discussed in appendix D.
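As a worked check of the characteristic equation, the sketch below locates the forcing at which a root of $P(s)$ passes through $s = -1$ for the fixed point $(\pi, \pi, 0, 0)$, where $H = -f_0\,\mathrm{diag}(l_1, l_2)/I$ and $P(-1) = |2(I_2 + L) + HM|$ is a quadratic in $f_0$. It assumes $\nu_1 = \nu_2 = \nu$, the parameter values of eq. (4.4), and that $L = e^{A_\nu T}$ with $M = A_\nu^{-1}(L - I_2)$, which is one way of reading the relation $L = I_2 + A_\nu M$; the helper names are hypothetical.

```python
import math


def expm_2x2(A, t):
    """exp(A t) for a 2x2 matrix with distinct real eigenvalues (Sylvester's formula)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    l1, l2 = 0.5 * (tr - disc), 0.5 * (tr + disc)
    e1, e2 = math.exp(l1 * t), math.exp(l2 * t)
    out = [[0.0, 0.0], [0.0, 0.0]]
    for r in range(2):
        for c in range(2):
            ident = 1.0 if r == c else 0.0
            out[r][c] = (e1 * (A[r][c] - l2 * ident)
                         - e2 * (A[r][c] - l1 * ident)) / (l1 - l2)
    return out


def pd_threshold(nu=1.0, period=1.0, inertia=1.0, l1=1.0 / math.sqrt(2.0), l2=1.0):
    """Smallest f0 > 0 with P(-1) = 0 at the fixed point (pi, pi, 0, 0)."""
    A = [[-2.0 * nu, nu], [nu, -nu]]               # A_nu for nu1 = nu2 = nu
    L = expm_2x2(A, period)
    det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    inv_a = [[A[1][1] / det_a, -A[0][1] / det_a],
             [-A[1][0] / det_a, A[0][0] / det_a]]
    lm = [[L[0][0] - 1.0, L[0][1]], [L[1][0], L[1][1] - 1.0]]
    M = [[sum(inv_a[r][k] * lm[k][c] for k in range(2)) for c in range(2)]
         for r in range(2)]
    # P(-1) = det(P0 - f0 * Q), with P0 = 2 (I + L) and Q = diag(l1, l2) M / I.
    P0 = [[2.0 * (1.0 + L[0][0]), 2.0 * L[0][1]],
          [2.0 * L[1][0], 2.0 * (1.0 + L[1][1])]]
    Q = [[l1 / inertia * M[0][c] for c in range(2)],
         [l2 / inertia * M[1][c] for c in range(2)]]
    c0 = P0[0][0] * P0[1][1] - P0[0][1] * P0[1][0]
    c1 = -(P0[0][0] * Q[1][1] + P0[1][1] * Q[0][0]
           - P0[0][1] * Q[1][0] - P0[1][0] * Q[0][1])
    c2 = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    disc = math.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return (-c1 - disc) / (2.0 * c2)               # smaller root of the quadratic in f0


f0_pd = pd_threshold()
```

The value obtained lies close to the upper end of the stability interval quoted for [(0,0);4] in the caption of table 1. The sketch only finds where a root reaches $-1$; it does not by itself establish stability below that value.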

For $f_0 = 9.0$, the nominal value in the control experiments of sections 4.4 and 4.5, all the fixed points are unstable. We have indicated in fig. 7 the number of unstable eigendirections at each fixed point.

We observe, from eq. (4.7), that as the forcing $f_0$ increases, the number of fixed points increases without bound. Not all these fixed points are necessarily embedded in the chaotic attractor, but those that are embedded in it are necessarily unstable. Furthermore, we find that the fixed points are roughly spread throughout the attractor, suggesting that there can be substantial flexibility to select among a variety of asymptotic behaviors by selecting different fixed points for control. (Even more flexibility can be achieved if we also consider periodic orbits of period greater than one.)

4.3.  Bifurcation  diagram 

A  bifurcation  diagram  shows  how  the  attrac¬ 
tors  of  a  dynamical  system  change  with  a  system 
parameter. 

In figs. 8a, 8b we present a bifurcation diagram for the double rotor map, which was obtained in the following way. For each value of the parameter we took a large number of initial angles $(x_1, x_2)$ with both $x_1$ and $x_2$ distributed uniformly in $[0, 2\pi]$ and iterated them starting with zero angular velocity [i.e., $(y_1, y_2) = (0, 0)$]. After iterating a sufficient number of times so that the orbits are essentially on the attractor, we plotted the $x_1$ component of all the orbits.

Fig. 8. (a), (b) Double rotor map: bifurcation diagram versus $f_0$ (parameters given by eq. (4.4); number of values of $f_0$ in each figure: 251; number of initial conditions for each $f_0$: 625; snapshot taken after 1000 iterations).

The diagram clearly exhibits a main branch that develops continuously for all values of the parameter. This main branch illustrates a period-doubling bifurcation sequence to chaos: a period-1 periodic orbit bifurcates (at $f_0 \approx 4.27$) to a period-2 periodic orbit which then bifurcates (at $f_0 = 6.42$) to a period-4 periodic orbit which then bifurcates (at $f_0 = 6.67$) to a period-8 periodic orbit, and so on, with an accumulation point of period doublings at $f_0 \approx 6.75$ beyond which chaos appears. The period-2 periodic orbit in the sequence results from the bifurcation of the stable orbit $Z_*^{[(0,0);4]} = (\pi, \pi, 0, 0)$ discussed in section 4.2, which exists for $f_0 \ge 0$; at the value $f_0 \approx 4.27$ at which this orbit becomes unstable the stable period-2 periodic orbit is born.

Although it cannot be seen in the diagram, this period doubling sequence is peculiar in the following sense: what appears to be a period-$2^m$ periodic orbit, $m \ge 2$, is in fact 2 period-$2^{m-1}$ periodic orbits. This is a consequence of the symmetry of the double rotor map that forces the period-2 orbit to become unstable (at $f_0 = 6.42$) through an eigenvalue 1 instead of through $-1$ as occurs in the normal period doubling bifurcation (an example of which is the bifurcation of the period-1 periodic orbit).

Besides this main branch, there are other period doubling sequences, one of which starts with a period-4 periodic orbit (at $f_0 = 3.42$) and ends with a crisis (at $f_0 = 3.84$). (A crisis is the sudden disappearance of a chaotic attractor by collision with an unstable periodic orbit [15, 16].)

It is convenient to have some quantitative characterization of the chaotic attractors revealed by the bifurcation diagram. For this purpose we introduce the spectrum of Lyapunov exponents, defined as follows [21, 22].


Consider an $n$-dimensional map $Z \mapsto F(Z)$ and its Jacobian matrix of partial derivatives $J(Z) = D_Z F(Z)$. Consider also the sequence $\{Z_0, Z_1, \ldots, Z_{k-1}\}$ generated by successive iteration of the initial condition $Z_0$. For this sequence introduce the matrix

$$ J_k = J(Z_{k-1}) J(Z_{k-2}) \cdots J(Z_1) J(Z_0). $$

Now let

$$ \zeta_1(k) \ge \zeta_2(k) \ge \cdots \ge \zeta_n(k) $$

denote the $n$ eigenvalues of $(J_k^T J_k)^{1/2}$, where $J_k^T$ is the transpose of $J_k$. The Lyapunov numbers of the map are then defined by

$$ \gamma_j = \lim_{k \to \infty} [\zeta_j(k)]^{1/k}, $$

where the positive real $k$th root is taken. They satisfy the same ordering as the $\zeta_j(k)$, $j = 1, \ldots, n$. The Lyapunov exponents are the logarithms of the Lyapunov numbers,

$$ L_j = \log_e \gamma_j, \quad j = 1, \ldots, n, $$

satisfying the same ordering

$$ L_1 \ge L_2 \ge \cdots \ge L_n. $$

Hence, for chaotic attractors of an $n$-dimensional map there are $n$ Lyapunov exponents, $L_j$, $j = 1, \ldots, n$. A chaotic attractor is defined to be one which possesses a positive Lyapunov exponent, $L_1 > 0$.
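In practice the limit above is not evaluated from the eigenvalues of $(J_k^T J_k)^{1/2}$ directly, because the product $J_k$ becomes badly conditioned. A common alternative is to propagate a set of tangent vectors and reorthonormalize them at every iterate, accumulating the logarithms of the stretching factors. The sketch below applies this to a planar map, using the Hénon map in the form $(x, y) \to (a - x^2 + by, x)$ as a test case; it is an illustration of the general technique and is not claimed to reproduce the exact procedure of refs. [21, 22]. Function names and iteration counts are arbitrary choices.

```python
import math


def henon_step(x, y, a=1.4, b=0.3):
    return a - x * x + b * y, x


def henon_jacobian(x, y, a=1.4, b=0.3):
    return ((-2.0 * x, b), (1.0, 0.0))


def lyapunov_exponents_2d(step, jacobian, x, y, n_iter=20000, n_transient=1000):
    """Estimate both Lyapunov exponents of a planar map by repeated
    Gram-Schmidt reorthonormalization of two tangent vectors."""
    for _ in range(n_transient):           # relax onto the attractor first
        x, y = step(x, y)
    u, v = (1.0, 0.0), (0.0, 1.0)          # orthonormal tangent vectors
    sums = [0.0, 0.0]
    for _ in range(n_iter):
        (j11, j12), (j21, j22) = jacobian(x, y)
        u = (j11 * u[0] + j12 * u[1], j21 * u[0] + j22 * u[1])
        v = (j11 * v[0] + j12 * v[1], j21 * v[0] + j22 * v[1])
        norm_u = math.hypot(u[0], u[1])
        u = (u[0] / norm_u, u[1] / norm_u)
        dot = v[0] * u[0] + v[1] * u[1]
        v = (v[0] - dot * u[0], v[1] - dot * u[1])   # remove the component along u
        norm_v = math.hypot(v[0], v[1])
        v = (v[0] / norm_v, v[1] / norm_v)
        sums[0] += math.log(norm_u)
        sums[1] += math.log(norm_v)
        x, y = step(x, y)
    return sums[0] / n_iter, sums[1] / n_iter


L1, L2 = lyapunov_exponents_2d(henon_step, henon_jacobian, 0.1, 0.1)
```

For this test map the Jacobian determinant is the constant $-b$, so the two estimates must sum to $\log_e 0.3$ regardless of the orbit, which gives a convenient consistency check.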

For typical dynamical systems the Lyapunov exponents are the same for almost all initial conditions in the basin of attraction of the attractor. (This is true in particular for the chaotic attractors of the double rotor map for which we calculated Lyapunov exponents; these results are reported below.) Thus the spectrum of Lyapunov exponents may indeed be considered to be a property of the attractor. For maps such that the determinant of the Jacobian matrix is independent of the variable $Z$ the Lyapunov exponents satisfy the identity

$$ \sum_{j=1}^{n} L_j = \log_e |J|. $$

This is true in the case of the double rotor map for which we have

$$ \sum_{j=1}^{4} L_j = \log_e |L| = (\lambda_1 + \lambda_2)T = -3\nu, $$

the last equality applying when $\nu_1 = \nu_2 = \nu$ and $T = 1$.

From the spectrum of Lyapunov exponents define the Lyapunov dimension,

$$ d_L = k + \frac{1}{|L_{k+1}|} \sum_{j=1}^{k} L_j, $$

where $1 \le k \le n - 1$ is the largest integer for which $\sum_{j=1}^{k} L_j \ge 0$. If $L_1 < 0$, we define $d_L = 0$; if $\sum_{j=1}^{n} L_j \ge 0$, we define $d_L = n$. (Note that $d_L = n$ is not possible in the case of dissipative systems for which $\log_e |J| < 0$.) Kaplan and Yorke [23, 24] conjecture that $d_L$, as given above in terms of the Lyapunov exponents, is typically equal to the fractal dimension of the support of the measure of the attractor (the information dimension).
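The definition translates directly into a short routine. The sketch below, with a hypothetical function name, evaluates the Lyapunov dimension for a list of exponents and applies it to the averaged exponents of the double rotor attractor at $f_0 = 9.0$ quoted in table 2.

```python
def lyapunov_dimension(exponents):
    """Lyapunov (Kaplan-Yorke) dimension from a list of Lyapunov exponents."""
    exps = sorted(exponents, reverse=True)
    if exps[0] < 0.0:
        return 0.0
    partial = 0.0
    for k, value in enumerate(exps):
        if partial + value < 0.0:
            # The first k exponents sum to a non-negative total; adding one more
            # makes the sum negative, so interpolate with |L_{k+1}|.
            return k + partial / abs(value)
        partial += value
    return float(len(exps))                 # the full sum is non-negative


table2 = [1.205, 0.256, -1.744, -2.717]     # averaged exponents from table 2
d_L = lyapunov_dimension(table2)
```

For these values $k = 2$, so the routine evaluates $2 + (L_1 + L_2)/|L_3|$, the same expression quoted in the caption of table 2; the four exponents also sum to $-3\nu$ with $\nu = 1$, as required by the determinant identity.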

We have numerically calculated the Lyapunov exponents and the Lyapunov dimension of the chaotic attractor in the main branch of the bifurcation diagram as a function of the forcing $f_0$. We used the method described in refs. [21, 22] to calculate the exponents of a large number of orbits in the basin of attraction and then took the average of these values. The results of the calculation at evenly spaced values along the $f_0$ axis are shown in fig. 9. The Lyapunov dimension first becomes positive at the onset of chaos ($f_0 \approx 6.75$). The attractor dimension goes through the integer values $d_L = 2$ and 3 at $f_0 \approx 6.88$ and 12.7, respectively.

In the numerical experiments on control that we describe in sections 4.4 and 4.5 we took $f_0 = 9.0$ as the nominal value of the control parameter. In Table 2 we list the corresponding values of the four Lyapunov exponents and the Lyapunov dimension. In order to illustrate the point made above regarding the fact that the Lyapunov exponents are the same for almost all initial conditions in the basin of attraction of the attractor, we also give some details on the numerical calculation of these exponents.

Fig. 9. Double rotor map: spectrum of Lyapunov exponents and Lyapunov dimension of chaotic attractors versus $f_0$ [eq. (4.4)].

Table 2
Double rotor map: calculation of Lyapunov exponents and Lyapunov dimension of chaotic attractor [$f_0 = 9.0$, other parameters given by eq. (4.4); number of initial conditions $N_0 = 256$; number of iterations = 10000]. $d_L = 2 + (L_1 + L_2)/|L_3| = 2.838$.

$j$                     1          2          3          4
$L_j$ (average)         1.205      0.256      -1.744     -2.717
minimum                 1.182      0.228      -1.771     -2.734
maximum                 1.229      0.284      -1.719     -2.693
standard deviation      0.00816    0.0102     0.00910    0.00724

We have now described in sufficient detail the two ingredients necessary to the application of the control method to the double rotor map: chaotic attractors and fixed points. It remains to be checked if the fixed points determined in section 4.2 are embedded in the chaotic attractor. By this we mean that any neighborhood of the fixed point contains an infinite number of points of the chaotic attractor. In order to check this, we consider the intersection of the attractor in its four-dimensional phase space with a three-dimensional hyperplane containing the fixed points $Z_*$ that we wish to check. Numerically we approximate the hyperplane by a very narrow slab through each fixed point of the form

$$ |K^T (Z - Z_*)| < w. \tag{4.9} $$

Actually we took the slabs parallel to the plane $(x_1, x_2)$, which implies that each slab contains the four fixed points with the same rotation number. We then examine a very long orbit and plot only those points satisfying (4.9). The intersection of our 2.8-dimensional attractor with a three-dimensional hyperplane is a 1.8-dimensional cross-section. The small scale structure of this 1.8-dimensional intersection is somewhat fuzzed out due to the finite slab thickness. The results, for $f_0 = 9.0$, are given in figs. 10a-10e, which refer to the rotation numbers $N = (0, 0)$, $(1, 2)$, $(0, 1)$, $(1, 1)$ and $(2, 3)$, respectively. In these figures the relevant fixed points are denoted by a + symbol. The results indicate, with different degrees of certitude, that the first four sets of fixed points are indeed embedded in the attractor while the fifth is not. Note that fig. 10a nicely reveals the symmetry of the map with respect to the point $(\pi, \pi, 0, 0)$. Note also the fractal-like structure in this figure.

Fig. 10. Double rotor map: sections of chaotic attractor by the slab $|K^T(Z - Z_*)| < w$, $K^T = (0, 0, 1, 1)$, through the fixed points (+) with rotation numbers (a) $N = (0, 0)$, (b) $N = (1, 2)$, (c) $N = (0, 1)$, (d) $N = (1, 1)$, (e) $N = (2, 3)$ [$f_0 = 9.0$, eq. (4.4)].

We conclude this discussion by mentioning what seems to be an interesting issue: the loss of hyperbolicity due to the existence of fixed points embedded in the attractor that have a number of unstable directions (that is, eigenvalues with magnitude bigger than one) different from the number of unstable directions of the attractor (that is, positive Lyapunov exponents). In fact, from the observation of fig. 7 and table 2, we see that while the chaotic attractor for $f_0 = 9.0$ has two positive Lyapunov exponents some of the unstable fixed points embedded in the attractor have only one unstable eigenvalue.

4.4. Control

We now proceed to apply the method developed in section 2 to control the fixed points of the double rotor map with control parameter $f_0$. Let us denote by $Z_*$ the fixed point to be controlled at the nominal value $\bar f_0$ of the parameter. The quantities that were introduced in section 2 now take the following particular form:

$$ A = D_Z F(Z_*) = \begin{pmatrix} I_2 & M \\ H & L + HM \end{pmatrix}, \quad B = D_{f_0} F(Z_*) = \frac{1}{I} \begin{pmatrix} 0 \\ 0 \\ l_1 \sin x_{1*} \\ l_2 \sin x_{2*} \end{pmatrix}, \quad C = (B \,\vdots\, AB \,\vdots\, A^2B \,\vdots\, A^3B). $$

One immediate conclusion that can be drawn from these results is that the controllability matrix $C$ is identically zero in the case of the fixed points with rotation numbers $N = (0, 0)$ for which $\sin x_{1*} = \sin x_{2*} = 0$. Hence these points are uncontrollable, at least when the control parameter is $f_0$. We will show in the next subsection that this set of fixed points can be controlled if we modify the double rotor map to allow for kicks with variable direction and then take as control parameter the angle the kicks make with the vertical direction in fig. 5.

The method is illustrated in figs. 11a, 11b. The control of the first fixed point was turned on at $i = 0$ with switches to control other fixed points occurring at later times. We plot the $x_1$ and $x_2$ coordinates of an orbit as a function of (discrete) time. The parameter perturbations were programmed to control successively four different fixed points of the set with rotation numbers $N = \pm(0, 1)$. The times at which we switched the control from stabilizing one fixed point to stabilizing another are labeled by the arrows in the figure. The figure clearly illustrates the flexibility offered by the method in controlling different periodic motions embedded in the attractor. The figure also shows that the time to achieve control is different from case to case.

Fig. 11. Double rotor map: successive control of fixed points (1) [(0,1);4], (2) [(0,-1);1], (3) [(0,1);1], (4) [(0,-1);4]. The arrows indicate the times of switching. The regulator poles correspond to projection onto the stable manifold [$\delta = 1.0$, $f_0 = 9.0$, eq. (4.4)].

We  now  report  the  results  of  several  numerical 
experiments  that  were  carried  out  with  the  pur¬ 
pose  of  understanding  the  behavior  of  the  time 
to  achieve  control. 

The first experiment was intended to confirm that the time to achieve control indeed follows an exponential probability distribution, as indicated in section 2.4. We proceeded to control the fixed point [(0, 1); 4] by starting at a large number of different points on the attractor and measuring the time each orbit took to reach the fixed point. We then obtained the distribution function of the time to achieve control by plotting a histogram of τ using bins of constant size. The results are presented as a semilog plot in fig. 12 and show excellent agreement with the predicted fit to a straight line.
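The kind of measurement described in this experiment can be imitated on a much simpler system. The sketch below is only an illustration, not the double rotor computation: it assumes the Hénon map in the form x' = a - x² + by, y' = x with the conventional values a = 1.4, b = 0.3, and it records the time an uncontrolled orbit needs to first enter a small disk around the fixed point. The disk radius, the sample size, the decorrelation gap and the function names are all hypothetical choices. For an exponential distribution the standard deviation is close to the mean and the logarithm of the histogram counts falls on a straight line of negative slope.

```python
import math
import statistics

# Assumed map and parameters (illustrative, not taken from the text):
# Henon map x' = a - x^2 + b*y, y' = x with a = 1.4, b = 0.3.
A_PARAM, B_PARAM = 1.4, 0.3


def henon(x, y):
    return A_PARAM - x * x + B_PARAM * y, x


def entry_times(radius=0.05, n_samples=1000, gap=100):
    """First-entry times of a chaotic orbit into a disk around the fixed point."""
    c = 1.0 - B_PARAM
    x_star = (-c + math.sqrt(c * c + 4.0 * A_PARAM)) / 2.0  # fixed point has x = y
    x, y = 0.0, 0.0
    for _ in range(1000):            # discard the initial transient
        x, y = henon(x, y)
    times = []
    while len(times) < n_samples:
        for _ in range(gap):         # decorrelate successive starting points
            x, y = henon(x, y)
        t = 0
        while math.hypot(x - x_star, y - x_star) >= radius:
            x, y = henon(x, y)
            t += 1
        times.append(t)
    return times


def semilog_slope(times, n_bins=12):
    """Least-squares slope of log(count) versus bin centre, constant-width bins."""
    width = 3.0 * statistics.mean(times) / n_bins
    counts = [0] * n_bins
    for t in times:
        i = int(t / width)
        if i < n_bins:               # ignore the sparse tail beyond 3 * mean
            counts[i] += 1
    pts = [((i + 0.5) * width, math.log(c)) for i, c in enumerate(counts) if c > 0]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    sxx = sum((p[0] - mx) ** 2 for p in pts)
    return sxy / sxx


times = entry_times()
mean_t = statistics.mean(times)
ratio = statistics.pstdev(times) / mean_t   # close to 1 for an exponential law
slope = semilog_slope(times)                # roughly -1 / mean_t
print(mean_t, ratio, slope)
```

Comparing -1/slope with the sample mean gives a quick consistency check of the exponential hypothesis.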

In our next experiment we looked at the dependence of the average time to achieve control on the size of the parameter perturbations, δ. The results are shown in fig. 13, where we have used logarithmic scales on both axes. The two fixed points [(0, 1); 4] and [(0, 1); 1] were controlled. (The first of these points has two unstable eigenvalues while the second has only one unstable eigenvalue.) We see that for the smaller values of δ the results closely follow straight lines, indicating a power law dependence,

$$\langle\tau\rangle \sim \delta^{-\gamma},$$

in accord with the theoretical predictions of ref. [6] for two-dimensional maps.
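A straight line on a log-log plot corresponds to a power law, and its slope gives the exponent. The following minimal sketch shows how such an exponent could be extracted by least squares. The data are synthetic and noise-free, generated here only to exercise the fit; the function name and the numerical values are hypothetical and are not results from the experiments.

```python
import math


def fit_power_law(deltas, taus):
    """Fit log10(tau) = log10(C) - gamma * log10(delta) by least squares."""
    xs = [math.log10(d) for d in deltas]
    ys = [math.log10(t) for t in taus]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope, 10.0 ** (my - slope * mx)   # (gamma, prefactor C)


# Synthetic data following tau = 50 * delta**(-1.2) exactly.
deltas = [10.0 ** (-k / 4) for k in range(4, 13)]
taus = [50.0 * d ** -1.2 for d in deltas]
gamma, prefactor = fit_power_law(deltas, taus)
print(gamma, prefactor)
```

With measured data one would restrict the fit to the small-δ range where the points actually follow a line, as the text does.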

Fig. 12. Double rotor map: histogram of the time to achieve control τ of a sample of 8192 orbits. The fixed point controlled was [(0, 1); 4]. The regulator poles correspond to projection onto the stable manifold [δ = 1.0, $f_0$ = 9.0, eq. (4.4)].

In the experiments described until this point the choice of the regulator poles (eigenvalues of $A - BK^T$) corresponded to projection onto the stable manifold of the fixed points. That is, the stable eigenvalues of matrix A were left unchanged, and the unstable eigenvalues were shifted to zero.
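Projection onto the stable manifold can be made concrete on a two-dimensional example. The sketch below is an illustration under stated assumptions, not the four-dimensional double rotor calculation: it takes the Hénon map in the form x' = a - x² + by, y' = x with a as control parameter (so B = (1, 0)^T), uses the conventional values a = 1.4, b = 0.3, and chooses the gain K so that $A - BK^T$ has one pole equal to the stable eigenvalue and the other equal to zero. The helper names are hypothetical.

```python
import math

A_PARAM, B_PARAM = 1.4, 0.3   # assumed Henon parameters


def eig2(m):
    """Real eigenvalues of a 2x2 matrix, sorted by increasing modulus."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)      # assumes real eigenvalues
    return sorted(((tr - disc) / 2.0, (tr + disc) / 2.0), key=abs)


def gain_for_poles(x_star, mu1, mu2):
    """Gain (k1, k2) such that A - B K^T has eigenvalues mu1 and mu2."""
    # A - B K^T = [[-2 x* - k1, b - k2], [1, 0]] has characteristic polynomial
    # lambda^2 + (2 x* + k1) lambda - (b - k2); match it to (lambda-mu1)(lambda-mu2).
    k1 = -(mu1 + mu2) - 2.0 * x_star
    k2 = B_PARAM + mu1 * mu2
    return k1, k2


c = 1.0 - B_PARAM
x_star = (-c + math.sqrt(c * c + 4.0 * A_PARAM)) / 2.0
A = [[-2.0 * x_star, B_PARAM], [1.0, 0.0]]
lam_s, lam_u = eig2(A)                         # stable, unstable eigenvalue

k1, k2 = gain_for_poles(x_star, lam_s, 0.0)    # keep lam_s, shift lam_u to 0
closed = [[A[0][0] - k1, A[0][1] - k2], [1.0, 0.0]]
poles = eig2(closed)
print(lam_s, lam_u, (k1, k2), poles)
```

In this example the gain reduces to k1 = λ_u and k2 = b, so the closed-loop map sends any small deviation onto the stable direction in one step.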

In our next set of experiments we looked at how different choices of regulator poles affect the average time to achieve control. We considered the fixed point [(0, 1); 4] with two unstable eigendirections and kept two of the regulator poles equal to the two stable eigenvalues of the fixed point. As regards the other two regulator poles, $\mu_1$ and $\mu_2$, three cases were considered:

(I) $\mu_2 = 0$,

(II) $\mu_2 = \mu_1$,

(III) $\mu_2 = -\mu_1$;

$\mu_1$ was then allowed to vary in the interval (-1, 1). The results of the experiments are shown in fig. 14. In cases (I) and (II) the average time to achieve control essentially increases with $|\mu_1|$, indicating behavior similar to that found for the Hénon map in fig. 2. In case (III) the average time to achieve control passes through a broad minimum. (Note that the point $\mu_1 = \mu_2 = 0$, which is common to the three cases, corresponds to projection onto the stable manifold.)

Fig. 13. Double rotor map: $\log_{10}\langle\tau\rangle$ versus $\log_{10}\delta$ for control of the fixed points (□) [(0, 1); 1], (△) [(0, 1); 4]. The regulator poles correspond to projection onto the stable manifold. The straight lines are least square fits to the data [excluding the last nine data points in the case of (□)], [$f_0$ = 9.0, eq. (4.4)].

Fig. 14. Double rotor map: $\log_{10}\langle\tau\rangle$ versus $\mu_1$ for (○) $\mu_2 = 0$, (□) $\mu_2 = \mu_1$, (△) $\mu_2 = -\mu_1$. The other two regulator poles were kept equal to the stable eigenvalues of the uncontrolled map. The fixed point controlled was [(0, 1); 4] [δ = 1.0, $f_0$ = 9.0, eq. (4.4)].

4.5. $f_0$-uncontrollable fixed points

We saw that the set of four fixed points with rotation numbers N = (0, 0) could not be controlled by changes in the parameter $f_0$ because the controllability matrix at these points is identically zero.

We show now that these fixed points can be controlled by modifying the double rotor map to allow for kicks with variable direction and then taking the direction of the kicks to be the control parameter, with the nominal value corresponding to the previously fixed direction.

Let us assume that the direction of the kicks makes an angle ψ with the vertical downward direction. Going back to the derivation of the double rotor map in Appendix C, it is easy to verify that the introduction of kicks with variable direction can be taken into account by simply replacing the function G used in the definition of the map and given by eq. (4.2) by the new function

$$G(X) = \begin{pmatrix} c_1 \sin(x_1 - \psi) \\ c_2 \sin(x_2 - \psi) \end{pmatrix}.$$


Taking ψ to be the control parameter with variations around the nominal value ψ = 0, the application of the method now involves the following quantities:

$$A = \begin{pmatrix} I & M \\ H & L + HM \end{pmatrix}, \qquad H = H(X_*) = \begin{pmatrix} c_1 \cos x_{1*} & 0 \\ 0 & c_2 \cos x_{2*} \end{pmatrix},$$

$$B^T = (0,\ 0,\ -c_1 \cos x_{1*},\ -c_2 \cos x_{2*}).$$

Fig. 15. Modified double rotor map: successive control of fixed points (1) [(0, 0); 3], (2) [(0, 0); 4] by kicks of variable direction. The arrows indicate the times of switching. The regulator poles correspond to projection onto the stable manifold [δ = 0.05, $f_0$ = 9.0, eq. (4.4)].

The fixed points are now all controllable by small perturbations of the parameter ψ around the nominal value ψ = 0.

Figs. 15a, 15b illustrate the control of the fixed points [(0, 0); 3] and [(0, 0); 4] by kicks of variable direction. The parameter perturbations were programmed to control the first of these points from i = 0 to i = $10^4$ and the second from i = $10^4$ to i = $2 \times 10^4$.


5.  Discussion 

The transient phase where the orbit wanders chaotically before locking in to a controlled orbit can be greatly shortened by applying the technique discussed by Shinbrot et al. [25]. In the latter paper it was pointed out that orbits can be rapidly brought to a target region on the attractor (in the present case the neighborhood of the periodic orbit which we wish to stabilize) by using small control perturbations when the orbit is far from the neighborhood of the periodic orbit to be stabilized. The idea was that, since chaotic systems are exponentially sensitive to perturbations, careful choice of even small control perturbations can, after some time, have a large effect on the orbit location and can be used to guide it. Thus the time to achieve control can, in principle, be greatly shortened by properly applying small controls when the orbit is far from the neighborhood of the desired periodic orbit.

One  issue  which  we  have  not  addressed  is  the 
effect  of  noise.  If  the  noise  remains  small,  it  may 
not  be  sufficient  to  kick  the  orbit  out  of  the 
neighborhood  of  the  chosen  periodic  orbit  where 
the  control  is  activated.  In  this  case,  the  orbit 
remains  near  the  desired  periodic  orbit  indefi¬ 
nitely.  However,  it  may  be  that  the  random 
noise  is  such  that  it  may  occasionally  kick  the 
orbit  far  enough  away  from  the  periodic  orbit 
that  the  orbit  falls  outside  the  small  controlled 
phase  space  region.  In  this  case,  after  the  orbit  is 
kicked  out  of  the  controlled  phase  space  region, 
it  wanders  chaotically  over  the  attractor  until  it 
falls  in  the  controlled  region  again.  Thus  there 
are  epochs  where  the  orbit  is  kept  near  the 
desired  orbit  interspersed  with  epochs  wherein 
the  orbit  wanders  chaotically  far  from  the  de¬ 
sired  orbit.  If  the  latter  are,  on  average,  relative¬ 
ly  much  shorter  than  the  former,  then  one  might 
still  regard  the  control  as  being  effective.  See  ref. 
[6]  for  numerical  experiments  on  this  effect  using 
the Hénon map. We also remark that the procedure discussed in the previous paragraph [25] can be used to greatly reduce the duration of the noise-induced epochs where the orbit bursts out of the controlled phase space region.

In  this  paper  we  have  considered  the  case 
where  there  is  only  a  single  control  parameter 
available  for  adjustment.  While  generically  a 
single  parameter  is  sufficient  for  stabilization  of  a 
desired periodic orbit, there may be some advantage to utilizing several control variables. In that case, the single control parameter p becomes a vector (e.g., ref. [26] discusses the case where
the  number  of  control  parameters  is  equal  to  the 
number  of  unstable  eigenvalues).  In  particular, 
the  added  freedom  in  having  several  control 
parameters  might  allow  better  means  of  choosing 
the  control  so  as  to  minimize  the  time  to  achieve 
control,  as  well  as  the  effects  of  noise. 

Finally  we  wish  to  point  out  that  full  knowl¬ 
edge  of  the  system  dynamics  is  not  necessary  in 
order  to  apply  our  technique  (see  also  ref.  [6]). 
In  particular,  we  only  require  the  location  of  the 
desired  periodic  orbit,  the  linearized  dynamics 
about  the  periodic  orbit,  and  the  dependence  of 
the location of the periodic orbit on small variation of the control parameter. Recently, delay coordinate embedding [19, 27] has been utilized in several experimental studies (refs. [8, 28-31])
to  extract  such  information  purely  from  observa¬ 
tions  of  experimental  chaotic  orbits  on  the  at¬ 
tractor  without  any  a  priori  knowledge  of  the 
system  of  equations  governing  the  dynamics,  and 
such  information  has  been  utilized  to  control 
periodic  orbits  [9].  Hence,  application  of  our 
method  is  not  limited  to  cases  where  a  complete 
knowledge  of  the  system  is  available. 

In  conclusion,  we  have  demonstrated  that  cha¬ 
otic  dynamics  can  often  be  converted,  by  using 
only  a  small  feedback  control,  to  motion  on  a 
desired  periodic  orbit.  Furthermore,  by  switch¬ 
ing  the  small  control,  one  can  switch  the  time 
asymptotic  behavior  from  one  periodic  orbit  to 
another.  In  some  situations,  where  the  flexibility 
offered  by  the  ability  to  do  such  switching  is 
desirable,  it  may  be  advantageous  to  design  the 
system  so  that  it  is  chaotic.  In  other  situations, 
where  one  is  presented  with  a  chaotic  system, 
the  method  may  allow  one  to  eliminate  the  chaos 
and  achieve  greatly  improved  behavior  at  rela¬ 
tively  low  cost. 
Appendix A. Time to achieve control in the case of two-dimensional maps

We assume that control is achieved if the orbit remains in the slab (2.8) for two consecutive iterations of the map. The two conditions

$$|K^T(Z - Z_*(\bar p))| < \delta, \qquad |K^T(Z' - Z_*(\bar p))| < \delta, \tag{A.1}$$

define a control "parallelepiped" $P_c$, where $Z' = F(Z, p)$. For small δ, an initial condition will bounce around on the set comprising the uncontrolled chaotic attractor for a long time before it falls in the control parallelepiped $P_c$. At any given iterate the probability of falling in $P_c$ is approximately the natural measure (see, for example, [17, 18, 23]) of the uncontrolled chaotic attractor contained in $P_c$. If we follow many orbits this probability $\mu(P_c)$ also gives the rate at which these orbits fall into $P_c$. Thus $\mu(P_c)$ is the inverse of the average time for a typical orbit to first fall in $P_c$,

$$\langle\tau\rangle^{-1} = \mu(P_c). \tag{A.2}$$

An estimate for $\mu(P_c)$ can be given in the two-dimensional case [23]:

$$\mu(P_c) \sim \rho \int_{P_c} |v_s|^{d_s - 1}\, |v_u|^{d_u - 1}\, dv_s\, dv_u, \tag{A.3}$$

where $v_s$ and $v_u$ denote linear coordinates in the stable and unstable directions. Here $d_u$ and $d_s$ are the pointwise dimensions [1] for the uncontrolled chaotic attractor at the fixed point in the unstable and the stable directions, respectively; ρ is a normalizing constant. Assuming that the attractor is smooth in the unstable direction we have $d_u = 1$, while $d_s$ is given in terms of the eigenvalues at the fixed point [1, 17, 18] by

$$d_s = \frac{\log_e |\lambda_u|}{\log_e (1/|\lambda_s|)}.$$
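As a worked illustration of the formula for $d_s$, the sketch below evaluates it at the fixed point of the Hénon map. It assumes the form x' = a - x² + by, y' = x with the conventional values a = 1.4, b = 0.3, which are not stated in this section. It also evaluates the exponent γ = ½ d_s + 1 that governs the scaling of $\mu(P_c)$ with δ in this appendix. Variable and function names are hypothetical.

```python
import math


def stable_pointwise_dimension(lam_u, lam_s):
    """d_s = log|lam_u| / log(1/|lam_s|) at a saddle fixed point."""
    return math.log(abs(lam_u)) / math.log(1.0 / abs(lam_s))


a, b = 1.4, 0.3                                   # assumed parameter values
x_star = (-(1.0 - b) + math.sqrt((1.0 - b) ** 2 + 4.0 * a)) / 2.0

# Jacobian at the fixed point is [[-2 x*, b], [1, 0]].
trace, det = -2.0 * x_star, -b
disc = math.sqrt(trace * trace - 4.0 * det)
lam_u, lam_s = (trace - disc) / 2.0, (trace + disc) / 2.0

d_s = stable_pointwise_dimension(lam_u, lam_s)
d_total = 1.0 + d_s            # d_u = 1 for an attractor smooth along the unstable direction
gamma = 1.0 + 0.5 * d_s
print(lam_u, lam_s, d_s, d_total, gamma)
```

With these assumed parameters the eigenvalues are about -1.924 and 0.156, giving d_s of about 0.35 and γ of about 1.18.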

In order to determine the control parallelepiped, we need to obtain Z' in the neighborhood of the fixed point $Z_*(\bar p)$ with a better approximation than that provided by the linear map (2.3). We therefore take

$$V' = AV + B(p - \bar p) + \tfrac{1}{2} Q(V, V) + \tfrac{1}{2} D\,(p - \bar p)^2, \tag{A.4}$$

where

$$V = Z - Z_*(\bar p), \qquad V' = Z' - Z_*(\bar p).$$

A, B were defined by (2.4), (2.5) and Q, D are two vectors with components $q_k$, $d_k$ (k = 1, 2) defined by

$$q_k = \sum_{i,j=1}^{2} \frac{\partial^2 f_k}{\partial x_i\, \partial x_j}\bigg|_{(Z_*(\bar p),\, \bar p)} v_i v_j, \qquad d_k = \frac{\partial^2 f_k}{\partial p^2}\bigg|_{(Z_*(\bar p),\, \bar p)};$$

here $x_k$, $f_k$ and $v_k$ (k = 1, 2) denote the components of the vectors X, F and V. Using (2.6) to eliminate $p - \bar p$ from eq. (A.4), we obtain

$$V' = AV + \tfrac{1}{2} Q(V, V) - B(K^T V) + \tfrac{1}{2} D\,(K^T V)^2. \tag{A.5}$$

The control "parallelogram" $P_c$ will therefore be defined by the two equations

$$|K^T V| < \delta, \qquad |K^T V'| < \delta. \tag{A.6}$$

In order to compare with the numerical experimental results described in section 3 we have carried out the calculation of (A.3) in the case of the Hénon map. Writing this map in the form

$$x_1' = a - x_1^2 + b x_2, \qquad x_2' = x_1,$$

and taking a to be the control parameter while b is kept fixed, we obtain

:)■ 

$$K^T V = k_1 v_1 + k_2 v_2,$$

$$K^T V' = -k_1 v_1^2 + (k_2 - k_1^2 - 2 x_* k_1)\, v_1 + (b k_1 - k_1 k_2)\, v_2.$$


Also we note that for the Hénon map the variables ($v_1$, $v_2$) and ($v_s$, $v_u$) are related by


■flc:) 


(A.7) 


where 

r,  =  (l  +  A^^''^  =  + 

Letting 

^  =  y,  ,  i7  =  y, +  ft;2,  t  =  k2lk^, 

and using (A.7) to change the variables of integration, eq. (A.3) can be written in the form


~  Po  j~|<  /  'd^drj. 


(A.8) 


where 


and 


~  =  ~  ^ulAjlysCA^  -  AJ]‘'' . 

Ho  H 

The  integration  in  the  variable  in  the  direction 
of  the  straight  lines  K^V=  ±6  can  be  done  exact¬ 
ly.  On  the  contrary,  except  in  the  case  A,  =  0, 
the  integration  in  the  other  variable  does  not 
seem  to  be  possible  in  closed  form.  We  have 
therefore  resorted  to  numerical  integration  to 
obtain  the  results  presented  in  fig.  16  (see 
below) . 


Fig.  16.  Henon  map:  (theoretical)  curves  of  log,,,  ((r)/ 
(f))?)  versus  e  =  arg(A'^)  with  lA^I  fixed  (a)  and  of 
versus  with  8  fixed  (b).  The  x  (  +  ) 
denotes  the  reference  value. 
An accurate analytical approximation can be
obtained  in  the  case  t=  'K  (slab  parallel  to  the 
stable  manifold)  in  the  limit  5— »0: 

M^c)~P.g(M,)5’'[l  +  0(5)],  (A.9) 

where  g  is  the  function  defined  by 

y  p,(|aJ  +  m,)^  ’ 

Pirn  <  Ml  <0  0<M|<Mim 
g(0)  =  4|Aj'V 


and

$$\gamma = \tfrac{1}{2} d_s + 1,$$

- Pr I''! 1 _ 1 _

d. |1 - rl'- • A. ■

The dependence of $\mu(P_c)$ on δ given by (A.9) is precisely that predicted in ref. [6]. The dependence of $\langle\tau\rangle$ on the regulator pole shows very good agreement with the experimental results (see fig. 2). Note that in plotting the theoretical curve we used the normalizing constant obtained by least square fitting the theoretical curve to the experimental points.

We have used eq. (A.8) to study the dependence of $\langle\tau\rangle = 1/\mu(P_c)$ on the gain vector $K^T$. In fig. 16a we have plotted curves of $\langle\tau\rangle$ versus θ = arg($K^T$) with $|K^T|$ kept fixed and in fig. 16b curves of $\langle\tau\rangle$ versus $|K^T|$ with θ kept fixed. $\langle\tau\rangle$ was normalized to its value at the point Q (see section 3 and fig. 1). The results show that $\langle\tau\rangle$ exhibits a strong minimum at θ = $\theta_Q$ for all values of $|K^T|$ and increases slowly with $|K^T|$ for all values of θ, in agreement with the experimental results of figs. 3 and 4.


Appendix B. Numerical method for calculating $\langle\tau\rangle$

In this appendix we describe the procedure used in sections 3 and 4 to numerically obtain the average time to achieve control, $\langle\tau\rangle$.

From (2.10) we obtain the fraction of chaotic transients with length smaller than some value $\tau_{\max}$,

$$p_f = \int_0^{\tau_{\max}} \phi(\tau)\, d\tau = 1 - \exp(-\tau_{\max}/\langle\tau\rangle),$$

and the average length of the chaotic transients with length smaller than $\tau_{\max}$,

$$\langle\tau\rangle_f = \frac{1}{p_f} \int_0^{\tau_{\max}} \tau\, \phi(\tau)\, d\tau = \langle\tau\rangle \left[ 1 + \frac{1 - p_f}{p_f} \log_e (1 - p_f) \right], \tag{B.1}$$

which is the required formula. Note that for $p_f = 1$, $\langle\tau\rangle_f = \langle\tau\rangle$.

The numerical procedure to calculate the average time to achieve control is as follows. Take a large number N of randomly chosen initial conditions and iterate each of them with the uncontrolled map [i.e., with $Z \mapsto F(Z, \bar p)$] a sufficient number of times until they are all distributed over the attractor according to its natural measure. Then switch on the control as specified by (2.9) and determine how many further iterates are necessary for $N_f \le N$ orbits to fall within a circle of small radius centered at the fixed point. Letting $\tau_{\max}$ be this number of iterates and $\{\tau_i\}$ with i = 1, ..., $N_f$ be the times required for the $N_f$ orbits to fall within the small circle, we have

$$p_f = \frac{N_f}{N}, \qquad \langle\tau\rangle_f = \frac{1}{N_f} \sum_{i=1}^{N_f} \tau_i.$$

Finally we use eq. (B.1) to obtain $\langle\tau\rangle$. In our numerical experiments described in sections 3 and 4, we took N = 192, $N_f$ = 121, values that led to a good compromise between accuracy and computation time.
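The procedure of this appendix amounts to estimating the mean of an exponential distribution from only the shortest transients. The sketch below illustrates the idea under the assumption that transient lengths are exponentially distributed, so that the mean of the $N_f$ shortest of N transients equals the true mean times 1 + ((1 - p_f)/p_f) ln(1 - p_f) with p_f = N_f/N. The transient lengths are drawn synthetically with the standard library rather than produced by a controlled map, and the function name, sample size and true mean are hypothetical.

```python
import math
import random


def mean_from_truncated(shortest_times, n_total):
    """Estimate <tau> from the n_f shortest transients out of n_total orbits."""
    n_f = len(shortest_times)
    p_f = n_f / n_total                       # fraction of orbits already captured
    if not 0.0 < p_f < 1.0:
        raise ValueError("need 0 < n_f < n_total")
    tau_f = sum(shortest_times) / n_f         # truncated average
    correction = 1.0 + (1.0 - p_f) / p_f * math.log(1.0 - p_f)
    return tau_f / correction


random.seed(0)
true_mean = 250.0                             # hypothetical <tau>
n_total = 20000
samples = sorted(random.expovariate(1.0 / true_mean) for _ in range(n_total))
n_f = int(0.63 * n_total)                     # stop once about 63% have been captured
estimate = mean_from_truncated(samples[:n_f], n_total)
print(estimate)
```

Stopping at a fraction near 1 - 1/e, as with N = 192 and N_f = 121, means that the run lasts roughly one mean transient time instead of waiting for the slowest orbit.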


Appendix  C.  Derivation  of  the  double  rotor 
map 

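The equations of motion of the kicked double rotor are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot\theta_j}\right) - \frac{\partial L}{\partial \theta_j} + \frac{\partial F}{\partial \dot\theta_j} = 0, \qquad j = 1, 2, \tag{C.1}$$

where the Lagrangian function L is the difference between the kinetic energy,

$$T(\dot\theta_1, \dot\theta_2) = \tfrac{1}{2}\left(I_1 \dot\theta_1^2 + I_2 \dot\theta_2^2\right),$$

and the potential energy,

$$V(\theta_1, \theta_2, t) = (l_1 \cos\theta_1 + l_2 \cos\theta_2)\, f(t),$$

i.e., L = T - V, and Rayleigh's dissipation function F is

$$F(\dot\theta_1, \dot\theta_2) = \tfrac{1}{2} \nu_1 I_1 \dot\theta_1^2 + \tfrac{1}{2} \nu_2 I_2 (\dot\theta_2 - \dot\theta_1)^2.$$

The sequence of forcing kicks is given by the semi-infinite comb of delta functions of period T and strength $f_0$,

$$f(t) = f_0 \sum_{k=1}^{\infty} \delta(t - kT). \tag{C.2}$$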


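Here $I_1$ and $I_2$ are the moments of inertia,

$$I_1 = (m_1 + m_2)\, l_1^2, \qquad I_2 = m_2 l_2^2,$$

and $\nu_1$, $\nu_2$ are the coefficients of friction. Elimination of L and F from (C.1) yields

$$\frac{d}{dt}\begin{pmatrix} \dot\theta_1 \\ \dot\theta_2 \end{pmatrix} = \begin{pmatrix} -(\nu_1 + \nu_2 I_2/I_1) & \nu_2 I_2/I_1 \\ \nu_2 & -\nu_2 \end{pmatrix} \begin{pmatrix} \dot\theta_1 \\ \dot\theta_2 \end{pmatrix} + f(t) \begin{pmatrix} (l_1/I_1) \sin\theta_1 \\ (l_2/I_2) \sin\theta_2 \end{pmatrix}. \tag{C.3}$$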


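We now proceed to integrate eq. (C.3). For simplicity we take $I_1 = I_2 = I$.

Since the effect of the kicks is instantaneous (i.e., f(t) = 0 for t ≠ kT, k = 1, 2, ...), eqs. (C.3) are linear between successive kicks. In particular, for 0 < t < T, eqs. (C.3) reduce to

$$\frac{d}{dt}\begin{pmatrix} \dot\theta_1 \\ \dot\theta_2 \end{pmatrix} = A_\nu \begin{pmatrix} \dot\theta_1 \\ \dot\theta_2 \end{pmatrix}, \qquad A_\nu = \begin{pmatrix} -(\nu_1 + \nu_2) & \nu_2 \\ \nu_2 & -\nu_2 \end{pmatrix}. \tag{C.4}$$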


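This system can be easily solved by the usual methods for linear differential equations with constant coefficients. Denoting by $\dot\theta_1(0)$, $\dot\theta_2(0)$ the initial angular velocities this solution is

$$\begin{pmatrix} \dot\theta_1(t) \\ \dot\theta_2(t) \end{pmatrix} = L(t) \begin{pmatrix} \dot\theta_1(0) \\ \dot\theta_2(0) \end{pmatrix}, \tag{C.5}$$

where

$$L(t) = \sum_{j=1}^{2} W_j e^{\lambda_j t};$$

$\lambda_1$, $\lambda_2$ are the eigenvalues of matrix $A_\nu$,

$$\lambda_{1,2} = -\tfrac{1}{2}(\nu_1 + 2\nu_2 \pm \Delta), \qquad \Delta = (\nu_1^2 + 4\nu_2^2)^{1/2},$$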

and  W,,  W2  are  the  constant  matrices 

:)■  ?)■ 

where 


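The position of the rods is obtained by integration of eq. (C.5). Denoting by $\theta_1(0)$, $\theta_2(0)$ the initial positions one obtains

$$\begin{pmatrix} \theta_1(t) \\ \theta_2(t) \end{pmatrix} = \begin{pmatrix} \theta_1(0) \\ \theta_2(0) \end{pmatrix} + M(t) \begin{pmatrix} \dot\theta_1(0) \\ \dot\theta_2(0) \end{pmatrix}, \tag{C.6}$$

where

$$M(t) = \int_0^t L(\xi)\, d\xi = \sum_{j=1}^{2} W_j\, \frac{e^{\lambda_j t} - 1}{\lambda_j}.$$

Eqs. (C.5), (C.6) completely describe the motion of the rotor for 0 < t < T (before the first kick).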
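The matrices L(t) and M(t) that describe the motion between kicks can be evaluated numerically. The sketch below is a minimal illustration under stated assumptions: it takes the friction matrix for equal moments of inertia, with rows (-(ν1 + ν2), ν2) and (ν2, -ν2), and builds each constant matrix $W_j$ as the spectral projector (A - λ_k I)/(λ_j - λ_k), which is the standard decomposition of a matrix exponential with two distinct eigenvalues. The parameter values ν1 = ν2 = 1 are illustrative only, and the function name is hypothetical.

```python
import math


def rotor_matrices(nu1, nu2, t):
    """Return (L(t), M(t)) for d(omega)/dt = A omega, using spectral projectors."""
    a = [[-(nu1 + nu2), nu2], [nu2, -nu2]]
    delta = math.sqrt(nu1 * nu1 + 4.0 * nu2 * nu2)
    lams = [-0.5 * (nu1 + 2.0 * nu2 + delta), -0.5 * (nu1 + 2.0 * nu2 - delta)]
    L = [[0.0, 0.0], [0.0, 0.0]]
    M = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        lj, lk = lams[j], lams[1 - j]
        for r in range(2):
            for c in range(2):
                # W_j = (A - lk I) / (lj - lk) projects onto the eigenspace of lj.
                w = (a[r][c] - (lk if r == c else 0.0)) / (lj - lk)
                L[r][c] += w * math.exp(lj * t)
                M[r][c] += w * (math.exp(lj * t) - 1.0) / lj   # integral of exp(lj s)
    return L, M


L0, M0 = rotor_matrices(1.0, 1.0, 0.0)   # L(0) is the identity, M(0) is zero
L1, M1 = rotor_matrices(1.0, 1.0, 1.0)
L2, M2 = rotor_matrices(1.0, 1.0, 2.0)   # L(2) equals L(1) L(1)
print(L1, M1)
```

Both eigenvalues are negative for positive friction coefficients, so L(t) decays and M(t) saturates as t grows.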

At t = T the kick instantaneously changes the angular velocity of each rod but not its position; that is, the angular velocity of each rod is discontinuous at t = T, while the position is continuous. Denoting by $\theta_j(T^\mp)$, $\dot\theta_j(T^\mp)$, j = 1, 2, the values of $\theta_j(t)$, $\dot\theta_j(t)$ just before and just after the kick at t = T, we therefore have

$$\theta_j(T^+) = \theta_j(T^-) = \theta_j(T), \tag{C.7}$$

$$\dot\theta_j(T^+) - \dot\theta_j(T^-) = \frac{l_j f_0}{I} \sin\theta_j(T), \tag{C.8}$$

for j = 1, 2.

The solution of eqs. (C.3) for T < t < 2T is identical to the solution of the linear system eq. (C.4) for 0 < t < T except that the initial conditions $\theta_j(0)$, $\dot\theta_j(0)$, j = 1, 2 are replaced by $\theta_j(T)$, $\dot\theta_j(T^+)$, j = 1, 2.

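The solution of eqs. (C.3) is a composition of the solution of eqs. (C.4) with the effect of the kicks at t = T, 2T, .... To study the dynamics of the rotor it is natural to consider only the state of the system immediately after each kick. Thus we obtain from (C.5)-(C.8) the double rotor map,

$$\begin{pmatrix} \theta_1^{(k+1)} \\ \theta_2^{(k+1)} \end{pmatrix} = M(T) \begin{pmatrix} \dot\theta_1^{(k)} \\ \dot\theta_2^{(k)} \end{pmatrix} + \begin{pmatrix} \theta_1^{(k)} \\ \theta_2^{(k)} \end{pmatrix}, \tag{C.9a}$$

$$\begin{pmatrix} \dot\theta_1^{(k+1)} \\ \dot\theta_2^{(k+1)} \end{pmatrix} = L(T) \begin{pmatrix} \dot\theta_1^{(k)} \\ \dot\theta_2^{(k)} \end{pmatrix} + \frac{f_0}{I} \begin{pmatrix} l_1 \sin\theta_1^{(k+1)} \\ l_2 \sin\theta_2^{(k+1)} \end{pmatrix}, \tag{C.9b}$$

where

$$\theta_j^{(k)} = \theta_j(kT), \qquad j = 1, 2,$$

are the positions of the rods at the instant of the kth kick, and

$$\dot\theta_j^{(k)} = \dot\theta_j(kT^+), \qquad j = 1, 2,$$

are the angular velocities of the rods immediately after the kth kick.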


Appendix  D.  Stability  of  fixed  points  for  the 
double  rotor  map 

The coefficients of the characteristic equation (4.8) depend on the fixed point and on the forcing $f_0$ only through the two non-zero elements of the matrix H, which we are going to denote by $h_{11}$ and $h_{22}$. The discussion of the stability of the fixed points is conveniently carried out in the plane ($h_{11}$, $h_{22}$), considering the intersections between the lines of marginal stability of the characteristic equation (where one of the roots has modulus unity) and the "orbits" described by the paths followed by the fixed points as the forcing $f_0$ is varied.

The  “orbits”  of  the  fixed  points  can  be  ob¬ 
tained  by  first  eliminating  (7=  1.2)  between 
the  two  equations 

f 

fa  sin  jc,  *  =  ,  /J„  =j  I,  cos  X,  *  , 

with  the  result 


/=T2, 


and  then  eliminating  /I,  between  these  two  equa¬ 
tions,  with  the  result 


which is the equation of the hyperbola described by each fixed point in the plane ($h_{11}$, $h_{22}$) when $f_0$ is varied. It should be pointed out that fixed points that are symmetric with respect to $(x_{1*}, x_{2*}, y_{1*}, y_{2*}) \mapsto (2\pi - x_{1*}, 2\pi - x_{2*}, -y_{1*}, -y_{2*})$ describe the same "orbit." The lines of marginal stability are defined by the equation (see eq. (4.8b))

$$P(e^{i\alpha}) = 0, \tag{D.1}$$

where α can take values in the interval [0, 2π). When α = 0 or α = π this equation simplifies considerably. We obtain:

(i) $P(1) = 0 \Rightarrow |HM| = |H||M| = 0$; as $|M| \ne 0$ this implies

$$h_{11} h_{22} = 0.$$

(ii) $P(-1) = 0 \Rightarrow |2(I + L) + HM| = |H + R||M| = 0$, where

$$R = 2(I + L) M^{-1};$$

writing $R = (r_{ij})$ this leads to

$$h_{11} h_{22} + r_{22} h_{11} + r_{11} h_{22} + (r_{11} r_{22} - r_{12} r_{21}) = 0.$$

When α ≠ 0, π it can be shown that eq. (D.1) has no solutions in the (real) plane ($h_{11}$, $h_{22}$); that is, there are no lines of marginal stability with α ≠ 0, π.

In fig. 17 we have plotted in the plane ($h_{11}$, $h_{22}$) the lines of marginal stability P(1) = 0 and P(-1) = 0 (for the parameter values given by eq. (4.4)). The bounded region between these lines, which is the shaded region in the figure, is the only region of the plane where all the roots of the characteristic equation have modulus smaller than unity. We have also plotted the "orbits" of the first five sets of fixed points, the arrows indicating the direction in which the forcing increases; the critical values of ($h_{11}$, $h_{22}$) at $f_0 = f_{0c}$, which occur on the lines P(1) = 0, are given in table 1. We see that of all these orbits only two cross the shaded region: one corresponds to the fixed point [(0, 0); 4]; the other to the fixed points [(1, 2); 4] and [(-1, -2); 1]. These fixed points are therefore stable while their "orbits" remain in the shaded region and become unstable when the "orbits" cross the line P(-1) = 0; that is, they are stable in a finite interval of values of $f_0$, $f_{0c} < f_0 < f_{0u}$. The other fixed points are unstable for all values of $f_0$. The values of $f_{0u}$ are also given in table 2.

Fig. 17. Double rotor map: stability diagram of the fixed points with rotation numbers ($n_1$, $n_2$) [$f_0$ = 9.0, eq. (4.4)].

Fig.  17  only  applies  for  the  particular  values  of 
the  parameters  given  by  eqs.  (4.4).  For  other 
values  the  relative  positions  of  the  lines  of  margi¬ 
nal  stability  and  “orbits”  of  fixed  points  are 
different,  and  fixed  points  with  other  rotation 
numbers  may  be  stable.  In  general,  we  can  make 
the  following  statements  regarding  the  stability 
of  the  fixed  points:  those  with  rotation  numbers 
(0,0)  are  stable  over  an  interval  0 
all  the  others  are  either  stable  over  an  interval 
y («,.«,)  < or  are  always  unstable. 
A numerical algorithm is presented for the purpose of reducing noise from a discretely sampled input signal where the
underlying  signal  of  interest  has  a  broadband  spectrum.  It  is  designed  to  be  useful  even  if  the  clean  signal  is  contaminated 
with  100%  or  more  noise  (signal  to  noise  ratio  less  than  or  equal  to  zero).  The  method  is  based  on  time  delay  embedding 
using  coordinates  generated  by  local  low-pass  filtering,  which  we  call  a  low-pass  embedding.  The  singular  value 
decomposition  is  then  used  locally  in  embedding  space  to  distinguish  between  the  dynamics  and  the  noise. 

The  algorithm  is  evaluated  for  chaotic  signals  generated  by  the  Lorenz  and  Rossler  systems,  to  which  Gaussian  white 
noise  has  been  added. 


1.  Introduction 

For  nonlinear  systems,  the  application  of  tradi¬ 
tional  linear  data  filtering  techniques  is  prob¬ 
lematic.  In  the  case  of  data  from  chaotic  systems, 
for  example,  the  power  spectrum  of  the  signal  of 
interest  as  well  as  the  noise  may  be  broadband. 
In  that  case,  filters  that  depend  on  differentiating 
signal  from  noise  on  frequency  considerations 
are  bound  to  have  difficulty  in  finding  the  signal. 

Embedding  methods  for  the  analysis  of  chaotic 
experimental  data  have  proved  to  be  quite  use¬ 
ful.  According  to  the  theory  of  embedding  [1,2], 
if  the  received  signal  is  generated  by  a  suffi¬ 
ciently  generic  measurement  function  on  the 
dynamical  attractor  of  the  system,  then  the  at¬ 
tractor  and  its  dynamical  properties  can,  in  prin¬ 
ciple,  be  reconstructed  from  the  signal. 

We use the ideas of embedding to design an algorithm that can separate additive noise from a deterministic signal. The algorithm is relatively simple to implement in the sense that the sophisticated ingredients, the Fourier transform and the singular value decomposition, are commonly available numerical algorithms.

Our  assumption  on  the  incoming  data  is  that  it 
is  the  result  of  a  continuous  process  sampled 
discretely  and  evenly  in  time.  We  also  assume 
that  the  noise  is  additive,  and  that  we  have  an 
estimate  of  the  level  of  the  noise.  The  algorithm 
is  an  iterative  procedure.  On  each  pass  through 
the  data,  corrections  are  added  to  the  data. 
There  are  four  main  steps  to  the  algorithm. 

First, the data is embedded using coordinates that are smoothed locally in time using the Fourier transform. One can think of this process as a "low-pass embedding". For a chosen window size of length w, the discrete transform on w points is evaluated. The Fourier components corresponding to the n/2 lowest frequencies are kept for some even integer n < w. Since the components are complex numbers, this corresponds to n independent degrees of freedom. The inverse Fourier transform on n points then yields a smoothed version of a windowed section of the signal. The embedding coordinates into $R^n$ are the n numbers output from the inverse Fourier transform: in other words, n evenly spaced samples of this smoothed section of length w.
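The low-pass embedding step can be sketched in a few lines. The code below is one possible reading of the description, not the authors' implementation: it assumes that the retained components are the discrete Fourier coefficients with indices 0 to n/2 - 1 of each window, and that the smoothed section is evaluated at n evenly spaced positions across the window. A direct O(w²) transform is used instead of an FFT so that the sketch needs only the standard library. Function names are hypothetical.

```python
import cmath
import math


def low_pass_coords(window, n):
    """Map a window of w samples to n smoothed, evenly spaced samples."""
    w = len(window)
    if n % 2 or not 0 < n < w:
        raise ValueError("n must be an even integer with 0 < n < w")
    half = n // 2
    # Keep the n/2 lowest-frequency DFT coefficients c_0 ... c_{n/2-1}.
    coeffs = [
        sum(x * cmath.exp(-2j * math.pi * k * m / w) for m, x in enumerate(window)) / w
        for k in range(half)
    ]
    coords = []
    for j in range(n):
        t = j * w / n                       # evenly spaced positions in the window
        s = coeffs[0].real
        for k in range(1, half):
            # For a real signal c_{-k} = conj(c_k), hence the factor 2 Re(...).
            s += 2.0 * (coeffs[k] * cmath.exp(2j * math.pi * k * t / w)).real
        coords.append(s)
    return coords


def low_pass_embedding(signal, w, n):
    """One n-dimensional point per sliding window of length w."""
    return [low_pass_coords(signal[i:i + w], n) for i in range(len(signal) - w + 1)]


window = [math.cos(2.0 * math.pi * m / 16) for m in range(16)]
coords = low_pass_coords(window, 4)         # one slow cosine sampled at 4 points
points = low_pass_embedding([float(m) for m in range(20)], 16, 4)
print(coords, len(points))
```

A window containing a single slow cosine passes through unchanged, while components above the retained frequencies are removed before the neighborhoods are formed.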

The  effect  of  this  step  is  to  smooth  the  data 
locally  in  time  so  that  meaningful  neighborhoods 
can  be  organized  in  the  second  step.  Additional¬ 
ly,  the  low-pass  embedding  enables  us  to  analyze 
information  which  is  nominally  w-dimensional  in 
n-dimensional  space.  If  n  is  small  compared  to 
w,  this  can  make  a  significant  difference.  In  the 
third  step,  when  the  corrections  to  the  data  are 
decided,  this  smoothing  will  be  dropped,  and  the 
corrections  will  be  made  to  the  actual  un¬ 
smoothed  data. 

The  second  step  is  to  organize  the  embedded 
points  in  neighborhoods  of  size  r,  where  r  is  a 
rough  estimate  of  the  size  of  the  noise.  The 
effect  of  the  first  step  was  to  try  to  insure  that  all 
points  in  a  neighborhood  of  radius  r  have  been 
perturbed  from  the  same  branch  of  the  attractor. 

The  third  step  is  to  project  the  points  in  the 
neighborhood  onto  the  attractor,  or  at  least  to 
push  the  points  in  that  direction.  To  do  this,  we 
use  the  singular  value  decomposition  [3]  to  calcu¬ 
late  the  principal  directions  of  the  set  of  vectors 
which  connect  a  fixed  base  point,  say  the  center 
of mass of the neighborhood points, to the embedded points. The idea is to correct the raw data by projecting onto a few principal directions, as preferred by the locally smoothed data. The use of the SVD to determine the global and/or local principal directions of an embedded attractor was first discussed in refs. [4, 5]. In ref. [6], these ideas were applied to increasing the accuracy of estimates of the correlation dimension of attractors.
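The projection step can be illustrated for the simplest case of a single principal direction. The sketch below is a substitute technique chosen to stay within the standard library: instead of a full singular value decomposition it finds the leading principal direction by power iteration on the scatter matrix of the centred neighborhood, which coincides with the leading right singular vector of the centred data. The choice of one retained direction, the starting vector and the function names are assumptions made for the illustration.

```python
import math


def leading_direction(dev, iters=100):
    """Leading principal direction of centred vectors, by power iteration."""
    dim = len(dev[0])
    scatter = [[sum(d[r] * d[c] for d in dev) for c in range(dim)] for r in range(dim)]
    u = [1.0 / (k + 1) for k in range(dim)]          # arbitrary starting vector
    for _ in range(iters):
        v = [sum(scatter[r][c] * u[c] for c in range(dim)) for r in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            break                                    # all points coincide
        u = [x / norm for x in v]
    norm_u = math.sqrt(sum(x * x for x in u))
    return [x / norm_u for x in u]


def project_neighborhood(points):
    """Project each point onto the line through the centre of mass along u."""
    n, dim = len(points), len(points[0])
    center = [sum(p[c] for p in points) / n for c in range(dim)]
    dev = [[p[c] - center[c] for c in range(dim)] for p in points]
    u = leading_direction(dev)
    projected = []
    for d in dev:
        amp = sum(dc * uc for dc, uc in zip(d, u))   # component along u
        projected.append([center[c] + amp * u[c] for c in range(dim)])
    return projected


# Points on the line y = 2x, displaced slightly in the perpendicular direction.
root5 = math.sqrt(5.0)
ts = [-2.0, -1.0, 0.0, 1.0, 2.0]
offsets = [0.01, -0.01, 0.0, -0.01, 0.01]
noisy = [[t - 2.0 * s / root5, 2.0 * t + s / root5] for t, s in zip(ts, offsets)]
cleaned = project_neighborhood(noisy)
print(cleaned)
```

In the algorithm the difference between each projected point and the original point would serve as the proposed correction for that neighborhood.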

Fourth,  after  the  corrections  necessary  to  pro¬ 
ject  the  points  onto  the  principal  directions  are 
determined,  we  make  use  of  the  fact  that  the 
noise  is  uncorrelated  with  the  signal  of  interest. 
(In  fact,  depending  on  the  application,  this  may 
be  the  definition  of  noise.)  For  each  embedding 
coordinate,  the  random  noise  in  that  coordinate 
has  expected  value  zero.  Therefore,  to  minimize 
the  introduction  of  new  correlations  in  the  noise 
from our algorithm, we postprocess the corrections
to ensure that they add to zero as well. The
postprocessing,  which  we  call  the  "correction 
decorrelation  step",  consists  of  averaging  the 
corrections  for  the  given  coordinate  across  the 
points  in  the  small  neighborhood.  Then  the  aver¬ 
age  is  subtracted  from  each  correction. 

Finally,  a  multiple  of  the  calculated  correction 
is  added  to  the  raw  data.  The  multiple  is  a 
number  between  0  and  1,  is  small  for  the  first 
pass  through  the  data  set,  and  is  incremented  by 
a  fixed  value  for  each  additional  pass,  as  the  data 
becomes  more  consistent  with  a  deterministic 
process. 

To  test  the  effectiveness  of  the  method  in 
reducing  noise,  we  generate  clean  signals  from 
computer  simulations  of  deterministic  systems, 
add  Gaussian  white  noise  of  a  given  level,  and 
then  apply  the  method.  The  systems  used  for  this 
study  are  the  Lorenz  and  Rossler  systems.  Re¬ 
sults  are  summarized  in  tables  1  and  2. 

Recently,  several  noise  reduction  methods  for 
nonlinear  signals  have  been  proposed  which  are 
in some way based on embedding ideas. Kostelich
and Yorke [7, 8] describe a method which
uses local linearizations constructed from the raw
data to adjust the signal. The purpose is to
minimize the self-inconsistency of the signal as
much as possible while changing the signal as
little  as  possible.  This  method  has  achieved  sig¬ 
nificant  noise  reduction  in  cases  of  low  noise. 
Schreiber and Grassberger [9] have more recently
introduced a simpler method, relying on
local linearizations, which seems to have similar
properties.

Hammel  [10]  and  Farmer  and  Sidorowich  [11] 
have  proposed  methods  which  use  local  lineariz¬ 
ations  and  adjustments  along  the  stable  and  un¬ 
stable  directions  of  the  dynamical  trajectory. 
This  noise  reduction  idea,  referred  to  as  the 
refinement  step  in  refs.  [12-14],  stemmed  from 
the  development  of  computer-assisted  methods 
for  the  rigorous  verification  of  existence  of 
shadowing  trajectories  in  computer  simulations. 
This  approach  can  reduce  noise  several  orders  of 
magnitude  when  the  underlying  dynamics  is 
known  a  priori,  but  so  far  has  not  been  demon¬ 
strated  to  work  when  the  dynamics  is  unknown. 
For  situations  where  the  dynamics  is  unknown 
but  a  true  reference  orbit  is  available  a  priori, 
Marteau  and  Abarbanel  [15]  give  a  method 
which  can  work  in  the  presence  of  a  significant 
amount  of  noise. 

A method which is closely related to the present
work is given by Cawley and Hsu [16]. As in
our  approach,  instead  of  constructing  local 
linearizations  to  the  dynamics,  they  try  to  project 
noisy  data  onto  the  attractor.  It  may  be  that  the 
projection  approach  is  less  sensitive  to  errors  in 
the  data  than  constructing  local  linearizations, 
and  so  is  more  likely  to  reduce  noise  in  a  high- 
noise  context.  In  the  present  method,  like  the 
methods  of  refs.  [7,  9, 16],  no  prior  knowledge  of 
the  underlying  attractor,  dynamics,  stable  and 
unstable  directions,  or  reference  orbit  is  re¬ 
quired. 


2.  Mathematical  background 

Assume  that  a  system  is  evolving  according  to 
a  set  of  ordinary  differential  equations,  and  that 
the  dynamics  is  confined  to  a  finite-dimensional 
attractor  A  in  the  phase  space  defined  by  the 
equations.  By  the  definition  of  phase  space,  the 
present  state  of  the  system  is  uniquely  defined  by 
a  point  in  phase  space.  The  measured  signal 
from  the  system  can  be  viewed  as  the  evaluation 
of  an  observation  function  h  from  phase  space  to 
the real numbers. If $x_t$ is the point on the
attractor that describes the state of the system at
time $t$, then the time series is produced by
evaluating the observation function $h$ at the state
$x_t$. The $t$th point of the time series is $s_t = h(x_t)$.

The embedding method involves using the
series of signals $\{s_t\}$ to construct a map from
the underlying attractor $A$ in phase space to a
reconstruction space $\mathbb{R}^n$. Another way to view
this map is that $n$ coordinates are being defined
that will reconstruct the attractor, if the coordinates
are independent and numerous enough.
This  general  approach  was  advocated  in  1980  by 
Packard  et  al.  [17],  in  which  the  authors  attribute 
to  Ruelle  the  idea  of  using  delay  coordinates. 
Roux  and  Swinney  [18]  were  among  the  first  to 
analyze  laboratory  data  using  delay-coordinate 
reconstructions. 

To be specific, a possible choice for the embedding
map $E: A \to \mathbb{R}^n$ is

$$E(x_t) = (h(x_t), h(x_{t+1}), \ldots, h(x_{t+n-1})) = (s_t, s_{t+1}, \ldots, s_{t+n-1}). \qquad (2.1)$$


Takens  [1]  proved  that  if  A  is  a  smooth  manifold 
of  dimension  d,  if  the  reconstruction  dimension  n 
is at least $2d + 1$, and if the dynamics on $A$ and
the observation function $h$ are topologically
generic, then the embedding map will be a
diffeomorphism. In ref. [2], it is proved that the
same  conclusion  holds  if  h  is  generic  in  a  mea¬ 
sure-theoretic  sense  (i.e.  probability-one)  and 
furthermore  is  true  in  the  case  that  A  is  perhaps 
not  a  smooth  manifold,  but  rather  a  fractal  at¬ 
tractor.  In  the  latter  case,  the  reconstruction 
dimension  n  must  be  strictly  greater  than  twice 
the  box-counting  dimension  d  of  the  attractor  A. 
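
As a minimal sketch of the delay-coordinate map (2.1), assuming the signal is held in a NumPy array, each row below is one reconstructed point built from $n$ consecutive samples. The helper name `delay_embed` is ours, chosen for illustration, and does not come from the paper.

```python
import numpy as np

def delay_embed(signal, n):
    """Return the (L - n + 1) x n matrix whose row t is (s_t, ..., s_{t+n-1})."""
    signal = np.asarray(signal, dtype=float)
    num_points = len(signal) - n + 1
    if num_points < 1:
        raise ValueError("signal is shorter than the embedding dimension")
    # Row t picks out indices t, t+1, ..., t+n-1 of the signal.
    index = np.arange(num_points)[:, None] + np.arange(n)[None, :]
    return signal[index]

points = delay_embed([0.0, 1.0, 2.0, 3.0, 4.0], n=3)
print(points)
```

Whether such a reconstruction is actually an embedding depends on the genericity and dimension conditions quoted above; the code only assembles the coordinates.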

A  more  general  choice  than  (2.1)  is  possible. 
In  ref.  [2]  it  is  shown  that  the  attractor  can  as 
well  be  reconstructed  using 


$$F(x_t) = B \begin{pmatrix} h(x_t) \\ \vdots \\ h(x_{t+w-1}) \end{pmatrix},$$

where $n \le w$ and $B$ is an $n \times w$ matrix which
avoids collapsing periodic points of the underlying
dynamical system of integral period less than
or equal to $w$.


3.  Low-pass  embedding 

In the present case, we recommend defining
$B = A_3 A_2 A_1$ to be the composition of the following
three linear operations:

$A_1$ = FFT (discrete Fourier transform) of order $w$;

$A_2$ = sets to zero all but the lowest $\tfrac{1}{2}n$ frequency
contributions;

$A_3$ = inverse FFT of order $n$, using the remaining
$\tfrac{1}{2}n$ frequencies.

The  theoretical  results  we  have  referred  to 
above state that given a series of data $\{s_t\}$ that is
noise-free, the hidden attractor $A$ and the reconstruction
$F(A)$ will have identical topological and
dynamical  properties.  That  fact  motivates  using 
the  same  approach  in  practice,  where  the  data  is 
noisy  and  the  length  of  the  series  is  finite.  We 
embed  the  signal  using  F,  and  then  try  to  correct 
for  noise  by  locally  forcing  the  data  to  lie  along 
the  principal  directions  of  F(A). 

The  matrix  B  has  the  effect  of  a  low-pass  filter, 
local  in  time.  One  starts  with  a  length  w  section  s 
of  the  signal,  and  represents  it  in  the  reconstruc¬ 
tion  by  a  real  vector  Bs  of  length  n.  If,  for 
example,  w  =  64  and  n  =  16,  the  information  that 
is contained in $s$ but missing from $Bs$ is essentially
the upper three-fourths of the Fourier spectrum
(the frequency components above $\tfrac{1}{4}$ of the
Nyquist frequency). If the sampling rate is not
too low, it is likely that the lower one-fourth of
the spectrum will be sufficient to give a good
approximation to the length $w$ section of the
clean signal underlying the noise.
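
Because the three operations are linear, the low-pass filter can be written out as an explicit real matrix. The sketch below builds it for $w = 64$ and $n = 16$, where 8 of the 32 frequencies below Nyquist survive, which is the lower one-fourth of the spectrum. The function name `low_pass_matrix` and the rescaling by $n/w$ are our assumptions; the text does not fix a normalisation.

```python
import numpy as np

def low_pass_matrix(n, w):
    """Build the real n x w matrix B = A_3 A_2 A_1 (n even, n <= w)."""
    spectrum = np.fft.rfft(np.eye(w), axis=0)       # A_1 applied to each basis vector
    kept = np.zeros((n // 2 + 1, w), dtype=complex)
    kept[: n // 2] = spectrum[: n // 2]             # A_2: keep the n/2 lowest frequencies
    # A_3: inverse FFT of order n. The factor n/w, which keeps amplitudes
    # unchanged, is an assumption made for this sketch.
    return np.fft.irfft(kept, n, axis=0) * (n / w)

B = low_pass_matrix(16, 64)
k = np.arange(64)
s = np.cos(2 * np.pi * 2 * k / 64)                  # a slowly varying 64-point section
Bs = B @ s
```

For a section as smooth as `s`, the 16 entries of `Bs` coincide with every fourth sample of `s`, so nothing is lost in this particular case.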

To see an example of this, begin with a $w = 64$
point section $s$ of the clean Lorenz signal with
sampling period $\Delta t = 0.05$. In fig. 1a, 64 points of
the  clean  signal  are  plotted  as  a  solid  curve,  and 
the  16-vector  Bs  is  plotted  as  16  small  boxes.  The 
boxes  lie  very  close  to  the  signal,  meaning  that 
little  information  has  been  lost  in  going  from  s  to 
Bs.  In  fact,  the  clean  signal  can  be  recovered  to 
reasonable  accuracy  from  the  boxes  by  doing  a 
16-point FFT back to frequency space, filling in
the high-frequency contributions with zeros,
and  then  transforming  back  with  a  64-point  FFT. 
Thus  we  trade  a  small  amount  of  information 
loss  for  a  great  deal  of  advantage  in  reducing  the 
dimensionality  of  the  embedding. 
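
The recovery just described can be sketched as follows. The function name `low_pass_lift` is hypothetical, and the factor $w/n$ is our assumption, chosen so that amplitudes are preserved; it should be matched to whatever normalisation is used in the forward direction.

```python
import numpy as np

def low_pass_lift(vector, w):
    """Map a length-n vector back to a length-w section by zero-filling high frequencies."""
    vector = np.asarray(vector, dtype=float)
    n = len(vector)
    spectrum = np.fft.rfft(vector)             # FFT of order n
    padded = np.zeros(w // 2 + 1, dtype=complex)
    padded[: n // 2] = spectrum[: n // 2]      # lowest n/2 frequencies; the rest stay zero
    # Inverse FFT of order w; the factor w/n (an assumption) undoes the
    # normalisation difference between the two transform lengths.
    return np.fft.irfft(padded, w) * (w / n)

m = np.arange(16)
boxes = np.cos(2 * np.pi * 2 * m / 16)         # 16 "boxes" sampled from a slow oscillation
recovered = low_pass_lift(boxes, 64)
```

Here `recovered` is the same oscillation on the finer 64-point grid.
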

Fig. 1. In both (a) and (b), the sixteen boxes shown are the
result of low-pass filtering the 64-point section of the signal
shown by the solid curve. The solid curve in (a) consists of 64
contiguous values of the x-coordinate of the Lorenz system
sampled at $\Delta t = 0.05$. The solid curve in (b) consists of the
same signal with 100% Gaussian white noise added.


Next, begin with a noisy signal $s$ and repeat
the exercise. In fig. 1b, the dotted curve is a
64-point clean Lorenz signal, and the solid curve is
the clean signal with 100% Gaussian noise
added. The small boxes, which represent the 16
points of $Bs$, track the underlying clean signal to
a reasonable degree. For the purpose of recovering
the  clean  signal,  it  can  be  argued  that  the  16- 
dimensional  vector  Bs  has  no  less  information 
than  the  64-dimensional  vector  s,  so  that  we  can 
safely  work  in  the  lower  dimension. 


4.  Algorithm 

The  derivation  of  the  method  makes  assump¬ 
tions  on  the  input  data.  Specifically,  we  assume 
that  the  signal  is  the  output  of  a  deterministic 
dynamical  system  to  which  white  noise  has  been 
added.  While  the  degree  of  success  of  the  meth¬ 
od  will  presumably  depend  on  the  extent  to 
which  the  assumptions  hold,  it  may  still  be  of  use 
where  one  or  more  of  the  assumptions  fail. 

The method is iterative in nature. Given the
original time series $\{s_i : 1 \le i \le L\}$, one pass of
the algorithm through the data replaces each $s_i$
by a new value $s_i'$. The resulting time series
should be a less noisy version of the original
series. Then the process can be applied to the
new series $\{s_i' : 1 \le i \le L\}$, and so forth.

The new number $s_i'$ is determined in the following
way. Several estimates $t_{j,i}$ of the correct
value of $s_i$ are generated, and their average $\bar{t}_i$ is
computed. Then $s_i' = s_i + m(\bar{t}_i - s_i)$, where
$0 \le m \le 1$ is a factor fixed for the entire pass through
the data. We typically use $m = 0.1$ for the first
pass, and slowly increase $m$ to 0.5 throughout
further passes. For example, we used
$m = m_p = \min\{0.1 + 0.02(p - 1),\ 0.5\}$ for pass $p$ on the
Lorenz attractor runs.
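
A minimal sketch of this relaxation update, reading the schedule as starting at 0.1, growing by 0.02 per pass, and staying at 0.5 once that value is reached. The function names are ours, for illustration only.

```python
def relaxation_factor(pass_number, start=0.1, step=0.02, cap=0.5):
    """Factor m_p for pass p = 1, 2, ...: grows linearly, then stays at the cap."""
    return min(start + step * (pass_number - 1), cap)

def relax(signal, averaged_estimates, m):
    """Move each s_i a fraction m of the way toward its averaged estimate."""
    return [s + m * (t - s) for s, t in zip(signal, averaged_estimates)]

# First pass: each value moves one tenth of the way toward its estimate.
updated = relax([1.0, 2.0], [2.0, 0.0], relaxation_factor(1))
```

With these defaults the cap is reached at pass 21, so later passes all use $m = 0.5$.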

Generating the estimates $t_{j,i}$ of the true value
of $s_i$ is a four-step process.

1.  Low-pass  embedding 

An  embedding  dimension  n  and  window 
length  w  are  chosen.  These  will  depend  on  the 
nature  of  the  data.  For  the  low-dimensional  cha¬ 
otic  attractors  we  have  analyzed  so  far,  we  have 
typically  used  n  =  16.  The  value  of  w  should  be 
larger  for  finely  sampled  signals  in  order  to  make 
full  use  of  the  data.  We  suggest  that  the  window 
contain  substantially  more  than  one  characteris¬ 
tic  period  of  the  signal,  if  the  latter  can  be 
determined. In the case of the Lorenz attractor,
the oscillations have an average period of roughly
1 time unit. If the sampling period is $\Delta t = 0.05$,
there are around 15-20 data points per oscillation.
Therefore we used window lengths of $w = 32$
and 64, and represented that section of the
signal with a vector in $\mathbb{R}^{16}$.

The embedding step is to replace each contiguous
section $(s_j, \ldots, s_{j+w-1})$ of the signal,
$j = 1, \ldots, L - w + 1$, with the corresponding point
in $\mathbb{R}^n$. To accomplish this, apply the FFT of
order $w$ to $(s_j, \ldots, s_{j+w-1})$. Then apply the
inverse FFT of order $n$ to the $\tfrac{1}{2}n$ lowest frequency
contributions, including the zero frequency
contribution. The result is an $n$-dimensional
real vector which represents the original
section of the signal of window length $w$ in the
reconstruction.
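
The embedding step can be applied to every window of the series at once. The following is a sketch under our own assumptions: the name `embed_series`, zero-based window indices, and a rescaling by $n/w$ that the text does not specify.

```python
import numpy as np

def embed_series(signal, w, n):
    """Low-pass embed every length-w window of the signal into R^n (n even, n <= w)."""
    signal = np.asarray(signal, dtype=float)
    starts = np.arange(len(signal) - w + 1)
    # Row j of `windows` is the contiguous section starting at sample j.
    windows = signal[starts[:, None] + np.arange(w)[None, :]]
    spectra = np.fft.rfft(windows, axis=1)            # order-w FFT of each window
    kept = np.zeros((len(starts), n // 2 + 1), dtype=complex)
    kept[:, : n // 2] = spectra[:, : n // 2]          # n/2 lowest frequencies, zero included
    # Order-n inverse FFT; the n/w rescaling is an assumption of this sketch.
    return np.fft.irfft(kept, n, axis=1) * (n / w)

t = 0.05 * np.arange(200)
points = embed_series(np.sin(2 * np.pi * t), w=32, n=16)
```

A series of length $L$ yields $L - w + 1$ reconstructed points, here 169 points in $\mathbb{R}^{16}$.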

2.  Neighborhood  selection 

Begin by choosing a point $x$ from the embedding
into $\mathbb{R}^n$ and a neighborhood radius $r$. For
best results, the distance $r$ should be chosen at
around the noise size of the data. Find all points
$x_j$ within the neighborhood $U$ centered at $x$ with
radius $r$. Define $c$ to be the center of mass of all
points in $U$, and consider the vectors $v_j = x_j - c$.

The  vectors  connecting  the  base  point  c  to  the 
other  points  within  distance  r  will  tend  to  be 
longer  in  the  direction  along  the  attractor  than 
transverse  to  the  attractor,  as  long  as  r  is  not 
chosen  to  be  too  small. 
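
A brute-force sketch of the neighborhood construction, assuming the embedded points are stored as rows of an array and that distance means the Euclidean norm (the text does not name a norm). The function name is hypothetical.

```python
import numpy as np

def neighborhood(points, center_index, r):
    """Return the indices of points within distance r of the chosen point,
    the center of mass c of that neighborhood, and the difference vectors to c."""
    points = np.asarray(points, dtype=float)
    distances = np.linalg.norm(points - points[center_index], axis=1)
    members = np.flatnonzero(distances <= r)
    c = points[members].mean(axis=0)
    return members, c, points[members] - c

cloud = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
members, c, v = neighborhood(cloud, center_index=0, r=1.5)
```

Each query scans the whole point set, which is adequate for a few thousand points; a spatial index would be a natural refinement for longer series.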

The number of neighborhoods to construct can
be chosen at the user's discretion. In our runs,
we did the following. For $w = 32$, we constructed
a neighborhood for each reconstructed point $x_j$
that had not yet been contained in a neighborhood.
For $w = 64$, we constructed a neighborhood
for every second $x_j$ as long as it had not
yet been in a neighborhood, to offset the extra
computational time caused by the doubling of
the order of the FFT.

3.  Singular  value  decomposition 

For each neighborhood that contains a sufficient
number of points, apply the singular value
decomposition to the matrix whose rows are the
vectors $v_j$ connecting the points $x_j$ to the base
point $c$. Define $p(v_j)$ to be the projection of $v_j$
into the linear space spanned by the right singular
vectors corresponding to the $M$ largest singular
values, where $M$ is a predetermined number
that roughly approximates the local dimension of
the attractor. We found that $M = 2$ was most
successful for signals from the Lorenz and Rossler
systems.
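
The projection onto the leading right singular vectors can be sketched with NumPy's SVD. The function name and the toy data are ours; the rows of `v` stand in for the difference vectors of one neighborhood.

```python
import numpy as np

def project_onto_principal(v, M):
    """Project each row of v onto the span of the M leading right singular vectors."""
    _, _, vt = np.linalg.svd(v, full_matrices=False)   # rows of vt are right singular vectors
    basis = vt[:M]                                      # singular values come sorted, largest first
    return v @ basis.T @ basis

# Points scattered mostly along the first axis, with small transverse offsets.
v = np.array([[2.0, 0.1, 0.0],
              [-2.0, -0.1, 0.0],
              [1.0, 0.0, 0.05],
              [-1.0, 0.0, -0.05]])
p = project_onto_principal(v, M=1)
```

In this toy case the projection removes only the small transverse components, which is the intended effect when the neighborhood is elongated along the attractor.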

To finish the step, we need to translate
$p(v_j) \in \mathbb{R}^n$ back into a corrected value of
$(s_j, \ldots, s_{j+w-1})$. This calls for the reverse of the FFT
smoothing step above. That is, add the base
point $c$ back to $p(v_j)$, apply the order $n$ FFT to
$c + p(v_j)$, and use the resulting $\tfrac{1}{2}n$ complex numbers
as the $\tfrac{1}{2}n$ smallest frequency contributions
to an order $w$ FFT, filling in the high-frequency
components with zeros. Invert the FFT to get a
vector $(t_{j,j}, \ldots, t_{j,j+w-1})$, which is a corrected
version of $(s_j, \ldots, s_{j+w-1})$.

4. Correction decorrelation step

So far $t_{j,j+k}$ is a new estimate for $s_{j+k}$ for
each $x_j \in U$, and for $k = 0, \ldots, w - 1$. For each
$k$, replace the estimate $t_{j,j+k}$ by

$$t_{j,j+k} - \frac{1}{L_U} \sum_{x_i \in U} (t_{i,i+k} - s_{i+k}), \qquad (4.1)$$

where $L_U$ denotes the number of points in $U$.
After the $t_{j,i}$ have been computed for all neighborhoods
$U$, define $\bar{t}_i$ to be the average over $j$ of
all contributions $t_{j,i}$, each of which is an approximation
to $s_i$. Finally, $s_i' = s_i + m(\bar{t}_i - s_i)$ is the
replacement for $s_i$ in the revised signal.
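
The correction decorrelation step amounts to removing, for each offset within the window, the mean correction over the neighborhood. A minimal sketch, with array layout and function name assumed for illustration:

```python
import numpy as np

def decorrelate(estimates, originals):
    """Subtract from each estimate the neighborhood-average correction.

    Row j of `estimates` holds the corrected window produced from point x_j in U;
    row j of `originals` holds the matching section of the current signal.
    """
    estimates = np.asarray(estimates, dtype=float)
    originals = np.asarray(originals, dtype=float)
    corrections = estimates - originals
    mean_correction = corrections.mean(axis=0)     # average over the points in U, per offset k
    return estimates - mean_correction

originals = np.array([[1.0, 2.0], [3.0, 4.0]])
estimates = np.array([[1.5, 2.0], [3.1, 3.0]])
adjusted = decorrelate(estimates, originals)
```

After this adjustment the remaining corrections, `adjusted - originals`, sum to zero in each column, as the step requires.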


5.  Data 

The  main  goal  of  this  section  is  to  evaluate  the 
effectiveness  of  the  noise  reduction  method  for  a 
range  of  noise  levels  and  sampling  rates.  To  set 
up  a  controlled  experiment,  artificial  data  sets 
with  additive  Gaussian  white  noise  were  gener¬ 
ated  from  the  Lorenz  and  Rossler  systems.  We 
used  an  embedding  dimension  of  n  =  16  and  a 
window  size  of  w  =  32  or  w  =  64  for  all  runs 
reported. 


The Lorenz equations [19] are

$$\begin{aligned} \dot{x} &= \sigma(y - x), \\ \dot{y} &= \rho x - y - xz, \\ \dot{z} &= -\beta z + xy, \end{aligned} \qquad (5.1)$$

where the parameters are set at the standard
values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$. A long trajectory
of the Lorenz attractor was generated using the
order four Runge-Kutta method with step-size
0.001. To construct a signal with sampling period
$\Delta t = 0.05$, the x-coordinate of every 50th point of
the trajectory was used for the clean signal. For
the signal with sampling period $\Delta t = 0.10$ every
100th point was chosen. To the clean signal was
added uncorrelated Gaussian noise whose standard
deviation is, respectively, 10%, 100%, and
200% of the RMS Lorenz signal strength. In all
cases treated in this section, a time series of 5000
points was used.
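
A sketch of how such a test signal might be generated. The initial condition, the discarded transient, the random seed, and all function names are our assumptions; only the equations, parameter values, step size, and sampling stride come from the text.

```python
import numpy as np

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return np.array([sigma * (y - x), rho * x - y - x * z, -beta * z + x * y])

def rk4_step(f, state, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = f(state)
    k2 = f(state + 0.5 * h * k1)
    k3 = f(state + 0.5 * h * k2)
    k4 = f(state + h * k3)
    return state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def noisy_lorenz_signal(num_points, every=50, h=0.001, noise_level=1.0, seed=0):
    """Sample x every `every` RK4 steps and add Gaussian noise scaled to the RMS signal."""
    rng = np.random.default_rng(seed)
    state = np.array([1.0, 1.0, 20.0])           # arbitrary starting point (assumption)
    for _ in range(5000):                        # discard a short transient (assumption)
        state = rk4_step(lorenz, state, h)
    clean = np.empty(num_points)
    for i in range(num_points):
        for _ in range(every):
            state = rk4_step(lorenz, state, h)
        clean[i] = state[0]
    rms = np.sqrt(np.mean(clean ** 2))
    noisy = clean + noise_level * rms * rng.standard_normal(num_points)
    return clean, noisy

clean, noisy = noisy_lorenz_signal(num_points=200)   # 100% noise, sampling period 0.05
```

Setting `every=100` gives the coarser sampling period of 0.10, and `noise_level` of 0.1 or 2.0 gives the 10% and 200% cases.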

The results of the application of the method
for various sampling rates and noise levels are
given in table 1. The signal-to-noise ratio (SNR) in dB
units is defined to be $\mathrm{SNR} = 20 \log_{10}(c/n)$,
where $s_i = c_i + n_i$ is the clean signal plus noise,
and where $c = \langle c_i^2 \rangle^{1/2}$ and $n = \langle n_i^2 \rangle^{1/2}$ are the
root-mean-square levels of clean signal and
noise.
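
The definition translates directly into code. In the sketch below, the noise is a deterministic stand-in (a scaled copy of the signal) so that the arithmetic is easy to check; real noise would of course be uncorrelated with the signal. Noise levels of 10%, 100%, and 200% of the RMS signal correspond to 20 dB, 0 dB, and about -6 dB.

```python
import math

def snr_db(clean, noisy):
    """SNR = 20 log10(c / n), with c and n the RMS levels of clean signal and noise."""
    noise = [s - c for s, c in zip(noisy, clean)]
    c_rms = math.sqrt(sum(c * c for c in clean) / len(clean))
    n_rms = math.sqrt(sum(e * e for e in noise) / len(noise))
    return 20.0 * math.log10(c_rms / n_rms)

clean = [1.0, -1.0, 1.0, -1.0]
noisy_10 = [c + 0.1 * c for c in clean]     # "noise" RMS is 10% of the signal RMS
noisy_200 = [c + 2.0 * c for c in clean]    # "noise" RMS is 200% of the signal RMS
```

The gain reported for a run is the difference between this quantity evaluated on the output signal and on the input signal.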

In  comparing  gains  in  SNR,  it  is  important  to 
consider  the  dependence  on  sampling  rate.  In¬ 
creasing  the  sampling  rate  and  adding  uncorre¬ 
lated  noise  in  the  same  way  would  have  the 
effect  of  moving  more  of  the  noise  into  high 
frequencies,  away  from  the  dominant  signal  fre¬ 
quencies. For example, in the oversampled case
of 100% noisy Lorenz data with $\Delta t = 0.005$,
where there are around 200 sample points per
oscillation,  the  method  achieves  a  gain  of  over 
21  dB,  much  higher  than  the  gains  shown  in  the 
table,  where  the  sampling  rates  are  lower. 

After the maximum gain in SNR has been
achieved, one may apply the method with different
values of the parameters, e.g. the window
length $w$, to the output signal in hopes of further
reduction. In fact, slightly larger gains can be
achieved this way, but in order not to complicate
our reporting of results for the basic algorithm
we do not list them here.

Table 1
Results of noise reduction method applied to noisy Lorenz data.

Δt     Avg. pts.     Noise   Window   Passes   Orig.   Final   Gain
       per oscill.   level   w                 SNR     SNR     (dB)
0.05   20            10%     32       18       20.0    31.7    11.7
0.05   20            100%    32       58        0.0    13.8    13.8
0.05   20            100%    64       15        0.0    14.2    14.2
0.05   20            200%    64       17       -6.0     6.5    12.5
0.10   10            100%    32       19        0.0    11.7    11.7

Algorithm performance for the specific case
$\Delta t = 0.05$ and 100% noise is illustrated in figs. 2
and 3. In fig. 2 we graph the amount of noise
reduction as a function of the number of passes
through the data. For this sampling rate and
amount of additive noise, the $w = 64$ version of
the method reaches a higher level of noise reduction
than the $w = 32$ version, and it reaches it
with fewer passes. Figure 3 illustrates the total
noise reduction achieved on a typical 200-point
section of the signal. In fig. 3a, the original noisy
signal, Lorenz plus 100% noise, is graphed with
a solid curve, and the clean Lorenz signal is the
dotted curve. In fig. 3b, the same section of the
signal is shown after 15 passes through the data,
again plotted against the clean signal. Figure 4
shows similar information to fig. 3, but for a
signal to which 200% Gaussian noise has been
added, and for a different section of 200 points.

Fig. 2. Noise reduction achieved for the Lorenz data with
sampling period $\Delta t = 0.05$ and with 100% noise added is
graphed against the number of passes through the data. The
circles and diamonds refer to application of the method with
window sizes $w = 32$ and $w = 64$, respectively.

Fig. 3. (a) The solid curve is a 200 point section of the input
signal, Lorenz data with sampling period $\Delta t = 0.05$ and 100%
additive white noise. The dotted curve is the Lorenz signal
before noise was added. (b) The solid curve is the output of
the noise reduction method for the same 200 point section as
in (a). The dotted curve is the clean signal, as in (a).

Fig. 4. (a) The solid curve is a 200 point section of the input
signal, Lorenz data with sampling period $\Delta t = 0.05$ and 200%
additive white noise. The dotted curve is the Lorenz signal
before noise was added. (b) The solid curve is the output of
the noise reduction method for the same 200 point section as
in (a). The dotted curve is the clean signal, as in (a).

The Rossler equations [20], motivated by the
dynamics of chemical reactions in a stirred tank,
are

$$\begin{aligned} \dot{x} &= -(y + z), \\ \dot{y} &= x + ay, \\ \dot{z} &= bx - cz + xz, \end{aligned} \qquad (5.2)$$

where the parameters are set at the standard
values $a = 0.36$, $b = 0.4$, $c = 4.5$.

Table 2 summarizes the results of the method
applied to data produced by sampling the
x-coordinate of the Rossler equations. The sampling
period of $\Delta t = 0.4$ is chosen to roughly
match the Lorenz data in terms of average points
per oscillation. The noise reduction results are in
the same general range as those for the Lorenz
data.

Table 2
Results of noise reduction method applied to noisy Rossler data.

Δt     Avg. pts.     Noise   Window   Passes   Orig.   Final   Gain
       per oscill.   level   w                 SNR     SNR     (dB)
0.4    16            10%     32       20       20.0    31.8    11.8
0.4    16            100%    32       45        0.0    13.5    13.5
0.4    16            100%    64       19        0.0    14.3    14.3
0.4    16            200%    64       40       -6.0     7.2    13.2


6.  Summary

We have described the use of the delay coordinate
embedding method of nonlinear data analysis
to separate additive noise from a perturbed
signal. The method uses a filtered version of
delay coordinate embedding called a low-pass
embedding, and the singular value decomposition
to project the input signal along directions
belonging to the signal of interest.

The  method  is  tested  on  artificial  data  gener¬ 
ated  by  the  Lorenz  and  Rossler  systems.  Signifi¬ 
cant  noise  reduction  is  measured  in  terms  of 
signal  to  noise  ratio,  as  documented  in  tables  1 
and  2.  Visually,  the  output  signal  appears  to 
recover  a  good  deal  of  the  dynamics  of  the 
original  signal,  as  shown  in  figs.  3  and  4.  For 
example,  we  recover  the  characteristic  increasing 
amplitude  of  the  Lorenz  oscillations  as  the  tra¬ 
jectory  spirals  out  from  the  unstable  periodic 
orbit  on  either  side. 

The  potential  utility  of  this  method  for  the 
computation  of  dynamical  invariants  from  noisy 
data  will  depend  on  the  success  with  which  the 
dynamics  can  be  reconstructed.  The  investigation 
of  this  aspect  will  be  the  subject  of  future  work. 

The  problem  of  reconstruction  of  ordinary  differential  equations  from  numerical  scalar  time  series  is  discussed. 
Techniques  are  exemplified  for  Rossler  and  Lorenz  chaotic  attractors,  with  emphasis  on  improved  algorithms  with  respect 
to  those  previously  published.  The  steps  which  are  still  required  in  order  to  investigate  experimental  noisy  data  are 
discussed. 


1.  Introduction 

Invariants characterizing underlying attractors
(generalized dimensions $D_q$ and entropies $K_q$,
associated singularity spectra, Lyapunov exponents)
may be evaluated from numerical scalar
time  series  (see  60  pioneering  references  quoted 
in  ref.  [1],  and  ref.  [2]  for  local  evaluations). 
They  depend  on  control  parameters  and  may  be 
invariant  under  changes  of  coordinates  (but  see 
ref.  [3]).  The  interest  has  recently  drifted  toward 
the  characterization  of  topological  indices,  rely¬ 
ing  on  the  study  of  periodic  orbits  embedded  in 
the  attractor,  which  are  independent  of  changes 
of  coordinate  systems  but  also  of  changes  in  the 
control  parameters  [4]. 

We  also  have  algorithms  to  compute  such 
quantities.  For  instance,  generalized  dimensions 
may  be  obtained  from  numerical  scalar  time 
series  by  using  fixed-radius  or  fixed-mass  ap¬ 
proaches,  or  from  the  determination  of  unstable 
periodic  orbits  which  are  dense  in  the  attractor 
[5-13].  Many  of  the  evaluations  rely  on  a 
theorem  due  to  Mane  and  Takens  [14-16]  which 


is  a  modified  version  of  a  previously  stated  Whit¬ 
ney  theorem. 

When  the  algorithms  are  successful,  valuable 
information  may  be  obtained.  For  instance,  we 
may  have  evaluated  the  effective  number  of 
degrees  of  freedom  required  to  describe  the 
dynamics,  telling  us  how  many  ordinary  differen¬ 
tial  equations  are  needed  to  construct  a  pheno¬ 
menological  model  of  the  process.  Casdagli  how¬ 
ever  comments  that  the  evaluation  of  invariants 
is  of  little  practical  interest  and  that,  in  particu¬ 
lar,  no  idea  is  given  as  to  how  to  construct  the 
model  itself  [17].  He  then  discusses  the  inverse 
problem  of  map  construction,  given  a  sequence 
of  iterates,  and  addresses  the  issue  of  forecasting 
by  using  the  map  as  a  predictive  model. 

In  this  paper,  instead  of  solving  the  inverse 
problem  to  obtain  a  model  as  a  map,  the  aim  is 
to  reconstruct  a  vector  field.  Both  problems  are 
closely  related:  a  map  may  be  lifted  to  a  flow 
and,  conversely,  a  flow  integrated  with  a  numeri¬ 
cal  discrete  scheme,  is  reduced  to  a  map.  How¬ 
ever,  they  are  also  very  different  in  many  re¬ 
spects. In particular, there are phenomena occurring
in maps which are absent from differential
systems [18, 19]. We therefore expect that
flow models may be safer than map models.
Furthermore, insofar as the mathematical language
of nature is closer to differential equations
than  to  discrete  iterations,  flow  models  might 
provide  us  with  more  accurate  insights  on  the 
actual  physical  processes  at  work. 

The  problem  of  vector  field  reconstruction  has 
been  discussed  for  the  Rossler  system  in  refs. 
[20-21]  and  for  the  Lorenz  system  in  refs.  [22- 
23].  This  paper  again  discusses  the  same  systems 
but  by  using  improved  algorithms  leading  to 
more  accurate  evaluations  of  reconstruction  con¬ 
stants,  possibly  by  orders  of  magnitude.  Many 
complementary  comments  and  discussions  given 
in  refs.  [20-23]  are  not  repeated  here. 

The  paper  is  organized  as  follows.  A  general 
mathematical  framework,  without  referring  to 
specific  systems,  is  discussed  in  section  2.  Recon¬ 
struction  techniques  and  evaluations  of  recon¬ 
struction  constants  for  the  Rossler  and  Lorenz 
systems  are  provided  in  section  3.  Qualitative 
and  quantitative  validations  are  given  in  section 
4.  Conclusions  are  given  in  section  5. 


2.  The  general  formulation 

We consider a dynamical system defined by a
set of ordinary differential equations (ODE's):

$$\frac{d}{dt} x = f(x; \mu), \qquad (1)$$

in which $x(t) \in \mathbb{R}^n$ is a vector valued function
depending on a parameter $t$ called the time and $f$,
the so-called vector field, is an $n$-component
smooth function generating a flow $\phi_t$ (see ref.
[24]). $\mu \in \mathbb{R}^p$ is the parameter vector with $p$
components, assumed to be constant in this
paper.

System (1) is called the original system (OS).
The OS may be known, for instance in section 3
where we shall explicitly consider the cases of
the Rossler and Lorenz systems. It may also be
unknown, for instance when studying experimental
systems. In any case, it is assumed that the
observer numerically, or experimentally, recorded
a scalar time signal. Without any loss of
generality, the recorded variable is taken as
being $\pi_1 x$, i.e. the projection of $x$ on the first
axis of $\mathbb{R}^n$, providing a sampled scalar time series
$\{\pi_1 x\}_i$.

The aim is thereafter to reconstruct a vector
field equivalent (in some sense) to eq. (1) under
the form of a standard system (SS) defined by

$$\frac{d}{dt} \pi_1 x = Y_1, \qquad (2)$$

$$\frac{d}{dt} Y_1 = Y_2, \qquad (3)$$

$$\vdots \qquad (4)$$

$$\frac{d}{dt} Y_{n-1} = F(\pi_1 x, Y_1, \ldots, Y_{n-1}), \qquad (5)$$

containing $n$ ODE's, the SS phase space being
spanned by $n$ standard coordinates
$(\pi_1 x, Y_1, \ldots, Y_{n-1})$. The knowledge of the number of
equations to be introduced in the SS must be
independently obtained as discussed in refs. [20-23].
These references also heuristically commented
on the existence of SS's but a complete
study of this problem has been postponed to
future work.

Starting from the time series $\{\pi_1 x\}_i$, and using
an efficient enough finite-difference scheme, a
series of $(n + 1)$-tuples
$\{\pi_1 x, Y_1, \ldots, Y_{n-1}, \frac{d}{dt} Y_{n-1}\}_i$
may be obtained. We are then left with a
fitting problem to evaluate the standard function
$F$ of eq. (5). This problem may be considered as
a problem of multivariate modeling of data, to
be solved in the framework of the theory of
approximations of functions [25-27], leading to
$F(\pi_1 x, Y_1, \ldots, Y_{n-1}, \{R_i\})$ in which $\{R_i\}$ is a
set of model parameters called the reconstruction
constants. The result is a standard reconstructed
system (SRS) which is available even
when the OS is unknown.
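
To make the two stages concrete, the sketch below builds the tuples by repeated central differences and then fits the last column by least squares. Both choices are ours: the text asks only for an "efficient enough" finite-difference scheme and leaves the function class open, so the linear model here is a stand-in, and the function names are hypothetical.

```python
import numpy as np

def standard_tuples(x, dt, n):
    """Return columns (x, Y_1, ..., Y_{n-1}, dY_{n-1}/dt) from successive central differences."""
    columns = [np.asarray(x, dtype=float)]
    for _ in range(n):
        columns.append(np.gradient(columns[-1], dt))
    trim = n                                   # drop edge samples, where differences are one-sided
    return np.column_stack(columns)[trim:-trim]

def fit_linear_standard_function(tuples):
    """Least-squares fit of the last column as a linear function of the other columns."""
    coordinates, target = tuples[:, :-1], tuples[:, -1]
    constants, *_ = np.linalg.lstsq(coordinates, target, rcond=None)
    return constants

dt = 0.01
t = dt * np.arange(2000)
tuples = standard_tuples(np.sin(t), dt, n=2)   # harmonic oscillator: dY_1/dt = -x
constants = fit_linear_standard_function(tuples)
```

For this toy signal the fitted constants are close to $(-1, 0)$, i.e. the fit recovers $\frac{d}{dt} Y_1 = -\pi_1 x$. Repeated differencing amplifies noise, so measured data would need a more careful derivative estimate.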

When the OS is known, the exact standard
function $F$ may be known, leading to the standard
exact system (SES). In the limit of perfect
reconstruction, any SRS must identify with the
SES. Also, we may know the direct standard
transformation (DST) expressing the standard
coordinates $(\pi_1 x, Y_1, \ldots, Y_{n-1})$ versus the
original coordinates $\pi_i x$, $i = 1, \ldots, n$ and the inverse
standard transformation (IST) expressing
the original coordinates versus the standard
coordinates. Graphic displays of the SES trajectories
may therefore be obtained by (i) integrating
the SES vector field (2)-(5) with $F$ being the
exact standard function or (ii) by integrating the
OS vector field and applying the DST to the OS
trajectory.

Next, we consider inverse standard reconstructed systems (ISRS) which may be studied when the OS is known. They are obtained by starting from SRS's with coordinates $(\pi_1 x, Y_1, \ldots, Y_{n-1})$ and by using the DST to express SRS's in terms of the original coordinates $\pi_i x$. Furthermore, we demand that the components 2, 3, ..., n of the OS (eq. (1)) be exactly satisfied. Therefore, all numerical errors associated with reconstructions are reported on the first equation for $\pi_1 x$. Consequently, ISRS's take the form:

$\pi_1 \dot{x} = F'(\pi_1 x, \ldots, \pi_n x, \{R_i\})$,  (6)

$\pi_2 \dot{x} = \pi_2 f(x; \mu)$,  (7)

$\vdots$

$\pi_n \dot{x} = \pi_n f(x; \mu)$.  (8)

When the $R_i$'s are given their exact values, i.e. in the limit of perfect reconstruction, ISRS's become an inverse standard exact system (ISES) which simply identifies with the OS. Other kinds of systems, not discussed in this paper, relevant to the case when the OS is unknown, are discussed in refs. [20-23].


3.  The  Rossler  and  the  Lorenz  systems 

Improved algorithms for the determination of the standard function F, leading to gains of accuracy by several orders of magnitude with respect to the techniques previously used, are presented in this section. Attention is focussed on OS's and ISRS's. Besides the need to keep this paper to a reasonable length, this choice is essentially due to the fact that these systems provide the most direct and convincing assessments of the quality of reconstructions because, in the limit of perfect reconstructions, ISRS's identify with OS's. For other kinds of systems and validations, see refs. [20-23].

3.1.  Mathematical  expressions  of  OS's  and 
ISRS's 

The Rossler OS reads

$\dot{x} = -y - z$,  (9)

$\dot{y} = x + ay$,  (10)

$\dot{z} = b + z(x - c)$,  (11)

with the parameter vector $\mu = (a, b, c)$ taken equal to (0.398, 2, 4) for which the asymptotic motion of the system settles down on to a chaotic attractor [18].

The Lorenz OS reads

$\dot{x} = \sigma(y - x)$,  (12)

$\dot{y} = Rx - y - xz$,  (13)

$\dot{z} = -bz + xy$,  (14)

with the parameter vector $\mu = (\sigma, R, b)$ taken equal to (10, 28, 8/3) for which again the asymptotic motion settles down on to a chaotic attractor [18, 28].
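As an illustration of how scalar time series may be produced from the two OS's, the following minimal sketch integrates eqs. (9)-(11) and (12)-(14) with a classical fourth-order Runge-Kutta step and records only the x coordinate. The function names, the initial condition and the value of the time step are assumptions chosen for the example, not values taken from the paper.

```python
def rossler(state, a=0.398, b=2.0, c=4.0):
    """Right-hand side of the Rossler OS, eqs. (9)-(11)."""
    x, y, z = state
    return (-y - z, x + a * y, b + z * (x - c))


def lorenz(state, sigma=10.0, R=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz OS, eqs. (12)-(14)."""
    x, y, z = state
    return (sigma * (y - x), R * x - y - x * z, -b * z + x * y)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = f(state)
    k2 = f(shifted(k1, 0.5 * dt))
    k3 = f(shifted(k2, 0.5 * dt))
    k4 = f(shifted(k3, dt))
    return tuple(
        s + dt / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def scalar_series(f, state, dt, n_steps):
    """Record only the first coordinate x_i, i.e. the scalar time series."""
    xs = []
    for _ in range(n_steps):
        xs.append(state[0])
        state = rk4_step(f, state, dt)
    return xs


# Hypothetical settings: dt and the initial condition are illustrative only.
xs = scalar_series(rossler, (1.0, 1.0, 1.0), dt=0.01, n_steps=1000)
```

In this sketch the y and z coordinates are deliberately discarded, since the reconstruction is meant to start from the scalar series alone.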
Both systems being of order n = 3, standard systems read

$\dot{x} = Y$,  (15)

$\dot{Y} = Z$,  (16)

$\dot{Z} = F(x, Y, Z)$.  (17)

For the Rossler system, the exact standard function F may be written as

$\dot{Z} = ab - cx + x^2 - axY + xZ + (ac - 1)Y + (a - c)Z - \frac{Y}{a + c - x}(x + b - aY + Z)$.  (18)

For the Lorenz system, the exact standard function F may be written as

$\dot{Z} = b\sigma(R - 1)x - b(\sigma + 1)Y - (b + \sigma + 1)Z - x^2 Y - \sigma x^3 + \frac{Y}{x}[(\sigma + 1)Y + Z]$.  (19)

In both cases, we note that these functions exhibit singular coordinate sets of Lebesgue measure 0, $\{x = a + c\}$ for the Rossler case and $\{x = 0\}$ for the Lorenz case. Discussion of these singularities, their exact nature and numerical consequences, is provided in refs. [20-23]. Both exact standard functions F may be rewritten as a ratio of polynomial expansions, according to

$\dot{Z} = \frac{\sum_{j+k+m=0}^{N_q} N_{jkm} x^j Y^k Z^m}{\sum_{j+k+m=0}^{N_q} D_{jkm} x^j Y^k Z^m}$.  (20)

The biggest $N_q$ generating non-zero contributions is called the degree of the system. Rossler and Lorenz systems are of degree 3 and 4, respectively. Furthermore, in the denominator, one of the $D_{jkm}$'s must be assigned a constant value (in practice, the simplest one: 1) to remove a degeneracy in the problem. For the Rossler system, we choose $D_{000} = 1$. For the Lorenz system, we have no choice and must set $D_{100} = 1$. The exact values of the constants $N_{jkm}$, $D_{jkm}$ which are not equal to 0 are given in tables 1 and 2 for the Rossler and Lorenz cases, respectively, except for $D_{000} = 1$ (Rossler) and $D_{100} = 1$ (Lorenz).
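The exact standard functions (18) and (19) can be checked numerically by differentiating x three times along the flow of the OS and comparing the result with F evaluated at the corresponding standard coordinates. The sketch below does this at a single arbitrary point for each system; the function names and the test points are assumptions made for the example.

```python
def rossler_F(x, Y, Z, a=0.398, b=2.0, c=4.0):
    """Exact standard function of the Rossler system, eq. (18)."""
    return (a * b - c * x + x ** 2 - a * x * Y + x * Z
            + (a * c - 1.0) * Y + (a - c) * Z
            - Y / (a + c - x) * (x + b - a * Y + Z))


def lorenz_F(x, Y, Z, sigma=10.0, R=28.0, b=8.0 / 3.0):
    """Exact standard function of the Lorenz system, eq. (19)."""
    return (b * sigma * (R - 1.0) * x - b * (sigma + 1.0) * Y
            - (b + sigma + 1.0) * Z - x ** 2 * Y - sigma * x ** 3
            + Y / x * ((sigma + 1.0) * Y + Z))


def rossler_standard(x, y, z, a=0.398, b=2.0, c=4.0):
    """Return (Y, Z, Zdot) by differentiating x along the Rossler flow."""
    xd, yd, zd = -y - z, x + a * y, b + z * (x - c)
    ydd = xd + a * yd               # d/dt of (x + a y)
    zdd = zd * (x - c) + z * xd     # d/dt of (b + z (x - c))
    return xd, -yd - zd, -ydd - zdd


def lorenz_standard(x, y, z, sigma=10.0, R=28.0, b=8.0 / 3.0):
    """Return (Y, Z, Zdot) by differentiating x along the Lorenz flow."""
    xd, yd, zd = sigma * (y - x), R * x - y - x * z, -b * z + x * y
    xdd = sigma * (yd - xd)
    ydd = R * xd - yd - xd * z - x * zd
    return xd, xdd, sigma * (ydd - xdd)


# Arbitrary test points, away from the singular sets x = a + c and x = 0.
Yr, Zr, Zdot_r = rossler_standard(1.0, 2.0, 0.5)
Yl, Zl, Zdot_l = lorenz_standard(1.5, -2.0, 20.0)
residual_rossler = rossler_F(1.0, Yr, Zr) - Zdot_r
residual_lorenz = lorenz_F(1.5, Yl, Zl) - Zdot_l
```

Both residuals should vanish up to rounding errors, which is a convenient sanity check before any fitting is attempted.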

With $\dot{Z}$ given by eq. (20), the ISRS of the
Rossler system is then found to be


Table 1

Values of constants $N_{jkm}$, $D_{jkm}$. Rossler case. $P = -1/(a + c)$.

Constant     Exact values                      (1)             (2)             (3)
$N_{000}$    ab = 0.796                        0.795 996 9     0.796 001 6     0.796 000 008
$N_{100}$    abP - c = -4.180 991 360          -4.180 978 7    -4.181 000 0    -4.180 991 338
$N_{010}$    ac - 1 + bP = 0.137 247 840       0.137 243 8     0.137 249 3     0.137 247 808
$N_{001}$    a - c = -3.602                    -3.601 987 3    -3.602 009 6    -3.601 999 999
$N_{200}$    1 - cP = 1.909 504 320            1.909 507 1     1.909 508 4     1.909 504 142
$N_{020}$    -aP = 0.090 495 680               0.090 494 8     0.090 497 1     0.090 495 675
$N_{110}$    a(Pc - 1) = -0.759 982 719        -0.759 981 0    -0.759 981 5    -0.759 982 548
$N_{101}$    1 + P(a - c) = 1.819 008 640      1.819 007 1     1.819 015 0     1.819 008 401
$N_{011}$    P = -0.227 376 080                -0.227 374 4    -0.227 373 3    -0.227 375 927
$N_{210}$    -aP = 0.090 495 680               0.090 496 3     0.090 494 9     0.090 495 594
$N_{201}$    P = -0.227 376 080                -0.227 376 4    -0.227 376 7    -0.227 375 980
$N_{300}$    P = -0.227 376 080                -0.227 376 4    -0.227 376 3    -0.227 376 018
$D_{100}$    P = -0.227 376 080                -0.227 376 4    -0.227 374 3    -0.227 376 028

Table 2

Values of constants $N_{jkm}$. Lorenz case.

Constant     Exact values                      (1)             (2)               (3)
$N_{200}$    b\sigma(R - 1) = 720              719.999 131     720.000 215       720.000 097
$N_{020}$    \sigma + 1 = 11                   10.999 984      10.999 999 981    11.000 001
$N_{110}$    -b(\sigma + 1) = -29.(3)          -29.333 174     -29.333 399       -29.333 350
$N_{101}$    -(b + \sigma + 1) = -13.(6)       -13.666 652     -13.666 656       -13.666 667
$N_{011}$    1                                 0.999 997       0.999 999 998     1.000 000 161
$N_{400}$    -\sigma = -10                     -9.999 985      -10.000 002       -10.000 001
$N_{310}$    -1                                -1.000 002      -0.999 999 050    -0.999 999 980

$\dot{x} = -\frac{1}{1 + z}\Big\{ -bc + x(a + b) + ya^2 + zc^2 - 2xzc + x^2 z + \Big[\sum_{j+k+m=0}^{3} N_{jkm}\, x^j (-y - z)^k (-b - x - ya + zc - xz)^m\Big] \times \Big[1 + \sum_{j+k+m=1}^{3} D_{jkm}\, x^j (-y - z)^k (-b - x - ya + zc - xz)^m\Big]^{-1} \Big\}$,  (21)

$\dot{y} = x + ay$,  (22)

$\dot{z} = b + z(x - c)$.  (23)

For the Lorenz case, the ISRS reads

$\dot{x} = \frac{1}{R + \sigma - z}\Big\{ (\sigma + 1)(Rx - y - xz) + x(xy - bz) + \Big[\sum_{j+k+m=0}^{4} N_{jkm}\, \sigma^{k+m-1} x^j (y - x)^k [x(R + \sigma) - y(\sigma + 1) - xz]^m\Big] \times \Big[\sum_{j+k+m=0}^{4} D_{jkm}\, \sigma^{k+m} x^j (y - x)^k [x(R + \sigma) - y(\sigma + 1) - xz]^m\Big]^{-1} \Big\}$, with $D_{100} = 1$,  (24)

$\dot{y} = Rx - y - xz$,  (25)

$\dot{z} = -bz + xy$.  (26)

In both cases, we note again the appearance of singular coordinate sets of Lebesgue measure 0. However, when constants $N_{jkm}$, $D_{jkm}$ are given their exact values, these ISRS's become OS's owning no singularity. Therefore, for a high-quality reconstruction, the amount of residual parasitic singularities in ISRS's is very small with the result that integrations of the systems do not require any special procedure (see refs. [20-23]).

3.2. Finite-difference schemes to evaluate derivatives

In practice, the OS's are integrated with a fourth-order Runge-Kutta technique and a constant time step $\delta t$, generating numerical scalar time series $\{x_i\}$ which are assumed to be all our knowledge concerning the systems. The reconstruction of standard functions F may be achieved from vectorial time series $\{x_i, Y_i, Z_i, \dot{Z}_i\}$, each quadruplet at time step i containing the value $x_i$ of the original variable and three successive derivatives $Y_i$, $Z_i$, $\dot{Z}_i$. We therefore need efficient finite-difference schemes to evaluate the derivatives.

All schemes may be deduced from a Taylor expansion of the considered variable (say x) at the considered time t:

$x(t + \Delta t) = x(t) + \Delta t\, x^{(1)}(t) + \frac{1}{2}(\Delta t)^2 x^{(2)}(t) + \cdots + \frac{1}{n!}(\Delta t)^n x^{(n)}(t) + O(\Delta t^{n+1})$.  (27)
Combining (i) $x(t)$ and $x(t + \Delta t)$ or (ii) $x(t)$ and $x(t - \Delta t)$, we obtain downward and upward (non-centered) first-order schemes reading respectively:

$x^{(1)}(t) = \frac{x(t + \Delta t) - x(t)}{\Delta t} + O(\Delta t)$,  (28)

$x^{(1)}(t) = \frac{x(t) - x(t - \Delta t)}{\Delta t} + O(\Delta t)$.  (29)

It has been systematically observed that non-centered schemes (such as the above first-order ones) led to poorer results than centered schemes, as could be expected because evaluations of derivatives at time t are actually shifted away from t, near $(t + \frac{1}{2}\Delta t)$ in (28) and near $(t - \frac{1}{2}\Delta t)$ in (29), the resulting errors being furthermore amplified by the need of successive derivatives. Centered schemes are therefore superior. Writing (27) for $x(t + \Delta t)$ and for $x(t - \Delta t)$, we readily obtain the simplest centered scheme which is of order 2,

$x^{(1)}(t) = \frac{x(t + \Delta t) - x(t - \Delta t)}{2\Delta t} + O(\Delta t^2)$,  (30)

leading to

$\dot{x}_i = Y_i = (x_{i+1} - x_{i-1})/(2\delta t)$,  (31)

$\dot{Y}_i = Z_i = (x_{i+2} - 2x_i + x_{i-2})/(2\delta t)^2$,  (32)

$\dot{Z}_i = (x_{i+3} - 3x_{i+1} + 3x_{i-1} - x_{i-3})/(2\delta t)^3$.  (33)
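The second-order centered formulas (31)-(33) translate directly into a routine that turns a scalar series into quadruplets of standard coordinates. The sketch below is one possible transcription; the function name is hypothetical, and the test signal (a sine sampled with step 0.01) is an assumption chosen only because its derivatives are known in closed form.

```python
import math


def centered_quadruplets(xs, dt):
    """Return (x_i, Y_i, Z_i, Zdot_i) for every interior index, eqs. (31)-(33)."""
    h = 2.0 * dt
    out = []
    for i in range(3, len(xs) - 3):
        Y = (xs[i + 1] - xs[i - 1]) / h
        Z = (xs[i + 2] - 2.0 * xs[i] + xs[i - 2]) / h ** 2
        Zdot = (xs[i + 3] - 3.0 * xs[i + 1] + 3.0 * xs[i - 1] - xs[i - 3]) / h ** 3
        out.append((xs[i], Y, Z, Zdot))
    return out


# Illustrative test signal: x(t) = sin t, so Y = cos t, Z = -sin t, Zdot = -cos t.
dt = 0.01
series = [math.sin(i * dt) for i in range(200)]
quads = centered_quadruplets(series, dt)
```

Three samples are lost at each end of the series, because the third derivative needs the points i - 3 and i + 3.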

To quantify the accuracy of these schemes, we consider N triplets $\{Y_i, Z_i, \dot{Z}_i\} = \{x_i^{(1)}, x_i^{(2)}, x_i^{(3)}\}$ of successive x-derivatives involved in the left-hand sides of standard systems (15)-(17), evaluated using successively the upward first-order scheme (29), the second-order scheme (30) and the fourth-order scheme (34), for the
Rossler  system,  with  At  =  8r  =  10^^  and  N  =  10^. 
These derivatives may also be evaluated independently, directly from the OS, and are then noted $\bar{x}_i^{(1)}$, $\bar{x}_i^{(2)}$, $\bar{x}_i^{(3)}$. From (9)-(11), we have

$\bar{x}_i^{(1)} = (-y - z)_i$,  (37)

$\bar{x}_i^{(2)} = (-\dot{y} - \dot{z})_i = -[x + ay + b + z(x - c)]_i$,  (38)

and similarly for $\bar{x}_i^{(3)}$. We may then evaluate scheme errors $\sigma^{(n)}$, in which n designates the order of the derivative, according to

(39)

Results are given in table 3. We observe the deterioration of the scheme errors when the order of the derivative increases and its improvement, by orders of magnitude, when the order of the finite-difference scheme increases. Obviously, still higher-order schemes may be examined.
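The trend described above can be reproduced on a test signal whose derivative is known exactly. The sketch below compares the upward first-order scheme (29), the second-order centered scheme (30) and the fourth-order centered scheme (34), written here as $(8A_1 - A_2)/(12\Delta t)$ with $A_1 = x_{i+1} - x_{i-1}$ and $A_2 = x_{i+2} - x_{i-2}$. A root-mean-square error is used as the error measure, which is an assumption of this example and not necessarily the definition of $\sigma^{(n)}$ used in the paper; the sine test signal and the step 0.01 are likewise illustrative.

```python
import math


def first_derivative_errors(dt=0.01, n=1000):
    """RMS error of three first-derivative schemes on x(t) = sin t."""
    xs = [math.sin(i * dt) for i in range(n)]
    squared = {"order1": 0.0, "order2": 0.0, "order4": 0.0}
    count = 0
    for i in range(2, n - 2):
        exact = math.cos(i * dt)
        a1 = xs[i + 1] - xs[i - 1]
        a2 = xs[i + 2] - xs[i - 2]
        estimates = {
            "order1": (xs[i] - xs[i - 1]) / dt,       # scheme (29)
            "order2": a1 / (2.0 * dt),                # scheme (30)
            "order4": (8.0 * a1 - a2) / (12.0 * dt),  # scheme (34)
        }
        for name, value in estimates.items():
            squared[name] += (value - exact) ** 2
        count += 1
    return {name: math.sqrt(total / count) for name, total in squared.items()}


errors = first_derivative_errors()
```

With these settings the three errors are separated by several orders of magnitude, in line with the qualitative statement made in the text.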

The fourth-order centered scheme is also discussed. This scheme is obtained from (27) by expressing $[x(t + 2\Delta t) - x(t - 2\Delta t)]$ and $[x(t + \Delta t) - x(t - \Delta t)]$, leading to

$x^{(1)}(t) = \frac{8A_1 - A_2}{12\Delta t} + O(\Delta t^4)$,  (34)

in which:

$A_1 = x_{i+1} - x_{i-1}$,  (35)

$A_2 = x_{i+2} - x_{i-2}$.  (36)

3.3. The fitting problem

When vectorial time series $\{x_i, Y_i, Z_i, \dot{Z}_i\}$ are obtained, we are then left with the problem of determining the standard function F. This is


Table 3

Scheme errors evaluated for several finite-difference
schemes. Rossler system.


Order  1 
(non-centered) 

Order  2 
(centered) 

Order  4 
(centered) 

tr'" 

1.5  X  10  ' 

9  X  10  ' 

2.5  X  10 

5.5  X  10  ’ 

5.5  X  10 

9.5  X  10  " 

2.5  X  10"' 

3.5  X  10  ' 

1  X  10  ' 
essentially a multivariate interpolation/extrapolation or, better, approximation problem. See refs. [25-27] for a background, and ref. [17] for a comprehensive review. When global techniques are used, the fitting problem must both define a mathematical structure for F (a model) and then numerically evaluate the values of the reconstruction constants. Generally, F may be expanded on a complete set of basis functions or polynomials, and expansion coefficients (the reconstruction constants) may be evaluated by using a least-square technique. An effective example using Legendre polynomials is provided by Cremers and Hübler [29] for the reconstruction of a limit cycle of a Van der Pol oscillator. Another popular choice in the theory of approximation of functions is the use of rational functions, which may be superior to polynomials because they are able to model functions with poles (see ref. [25], ch. 3), which is indeed the case for the exact standard functions of the Rossler and Lorenz systems (section 3.1). Rational functions (20) are therefore used in this paper.

Once the decision of using a global technique with (20) is taken, we must determine the degree $N_q$ which is in principle unknown. Also, to remove a degeneracy in the problem, one of the denominator constants $D_{jkm}$ must be assigned a constant value that we may choose to be 1. In principle, we also do not know for which constant(s) the assignment of value 1 is possible. Therefore, a general procedure may be to attempt fitting with successive degrees of approximation $N_q = 1, 2, \ldots$ and simultaneously attempting to set $D_{000} = 1$, then $D_{100} = 1$, ..., until we obtain a satisfactory fitting. For the Rossler system, we would determine $N_q = 3$ and find that we may set $D_{000} = 1$. For the Lorenz system, we would determine $N_q = 4$ and find that the only possible choice is $D_{100} = 1$. These facts are now considered as being established.

In refs. [20-23], the determination of the reconstruction constants was performed as follows. We first rewrote (20) as a linear equation. For instance, in the Lorenz case, we obtained

$\sum_{j+k+m=0}^{4} N_{jkm}\, x^j Y^k Z^m - \dot{Z} \sum_{j+k+m=0,\ (j,k,m) \neq (1,0,0)}^{4} D_{jkm}\, x^j Y^k Z^m = x\dot{Z}$,  (40)

containing 69 constants, constant $D_{100}$ being excluded. When 69 quadruplets $\{x_i, Y_i, Z_i, \dot{Z}_i\}$ are sampled, (40) provides a set of linear equations which may be solved by the Cramer technique to obtain numerical values of reconstruction constants $N_{jkm}$, $D_{jkm}$. Actually, for the sake of accuracy, we solve a number of linear sets and carry out averages for each constant. After this step is carried out, we may identify on objective grounds the constants $N_{jkm}$, $D_{jkm}$ which are theoretically equal to 0, because their values are noise-dominated. Dismissing these constants, (40) simplifies, leading us to a similar problem with a smaller number of constants which are again similarly evaluated. It also appears that some sets lead to more accurate results than others, due to outliers. Outliers are eliminated by a discrimination procedure allowing for a refinement of the evaluations. At the outcome of such an algorithm, we obtain the results given in table 1, column (1) for the Rossler system, and in table 2, column (1) for the Lorenz system, with a second-order finite-difference scheme.
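The count of 69 constants follows from the number of monomials $x^j Y^k Z^m$ with $j + k + m \leq 4$, namely 35 in the numerator and 35 in the denominator, minus the one denominator constant fixed to 1. The sketch below enumerates these monomials and builds one row of the linearised rational model for a single quadruplet. The function names and the ordering of the unknowns are assumptions made for the example.

```python
def monomial_exponents(degree):
    """All exponent triples (j, k, m) with 0 <= j + k + m <= degree."""
    return [(j, k, m)
            for j in range(degree + 1)
            for k in range(degree + 1 - j)
            for m in range(degree + 1 - j - k)]


def linear_row(x, Y, Z, Zdot, degree, fixed):
    """Coefficients and right-hand side of the linearised model for one quadruplet.

    `fixed` is the exponent triple of the denominator constant set to 1,
    e.g. (1, 0, 0) for the Lorenz case and (0, 0, 0) for the Rossler case.
    """
    exps = monomial_exponents(degree)
    mono = {e: x ** e[0] * Y ** e[1] * Z ** e[2] for e in exps}
    row = [mono[e] for e in exps]                          # multiply the N_jkm
    row += [-Zdot * mono[e] for e in exps if e != fixed]   # multiply the D_jkm
    return row, Zdot * mono[fixed]


# Arbitrary quadruplet, used only to show the shapes involved.
row_lorenz, rhs_lorenz = linear_row(2.0, 3.0, 5.0, 7.0, degree=4, fixed=(1, 0, 0))
row_rossler, rhs_rossler = linear_row(2.0, 3.0, 5.0, 7.0, degree=3, fixed=(0, 0, 0))
```

Stacking such rows for as many quadruplets as there are unknowns gives the square linear sets mentioned above; stacking more of them gives an overdetermined set.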

The accuracy of the evaluations for constants which are theoretically not equal to 0 may be quantified by using $\epsilon$, the absolute value of the relative difference between the theoretical value of the constant and its reconstructed value. For constants which are theoretically equal to 0, we may use $\Delta$, the absolute difference between the theoretical values and the reconstructed values (not given here), i.e. the modulus of the reconstructed values themselves. For the Rossler and
Lorenz systems, the average $\epsilon$ per constant in
columns  (1)  are  6x  10'^  and  2x  10  ^  respec¬ 
tively. 
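As a small worked example of these two accuracy indicators, the sketch below evaluates the relative error for the Lorenz constant whose exact value is $b\sigma(R - 1) = 720$, using the reconstructed values 719.999 131 and 720.000 215 listed in table 2, columns (1) and (2). The function names are hypothetical.

```python
def relative_error(exact, reconstructed):
    """Indicator for constants that are theoretically non-zero."""
    return abs((exact - reconstructed) / exact)


def absolute_error(reconstructed):
    """Indicator for constants that are theoretically zero."""
    return abs(reconstructed)


exact = 720.0
eps_column_1 = relative_error(exact, 719.999131)
eps_column_2 = relative_error(exact, 720.000215)
```

For this particular constant the relative error is of the order of $10^{-6}$ in column (1) and a few times $10^{-7}$ in column (2).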

In this paper, we investigate the use of a least-square method to solve the fitting problem and also the influence of the finite-difference schemes. The example of (40) is considered (Lorenz case). The indeterminates $N_{jkm}$, $D_{jkm}$ are arranged in a vector $X_j$ ($j = 1, \ldots, N_c$) made of $N_c$ lines. When N time steps are recorded, (40) forms a set of N linear equations. The coefficients of the left-hand sides are monomials involving variables x, Y, Z, $\dot{Z}$ and form a matrix $A_{ij}$ made of N lines and $N_c$ columns ($i = 1, \ldots, N$; $j = 1, \ldots, N_c$). The right-hand sides form a vector $b_i$ ($i = 1, \ldots, N$) made of N lines. Knowing $A_{ij}$ and $b_i$, the problem is to find $X_j$ satisfying the relation (Einstein notation used):

$A_{ij} X_j = b_i$.  (41)

Using $N > N_c$, the problem is overdetermined. It may be solved by means of a generalized least-square method (ref. [26], p. 197), searching for the best solution accounting for the overdetermination. We define a residual vector

$r_i = b_i - A_{ij} X_j$.  (42)

The best solution is the one which minimizes the Euclidean norm of this vector:

$\|r\|^2 = r_i r_i$.  (43)

One then shows that this solution satisfies a normal set of equations [26],

$B_{ij} A_{jk} X_k = B_{ij} b_j$,  (44)

in which $B_{ij} = A_{ji}$ is the transpose of matrix A. Set (44) contains $N_c$ equations ($i = 1, \ldots, N_c$) and $N_c$ unknowns ($k = 1, \ldots, N_c$). Furthermore, for each constant $X_i$, the quality of the fit is indicated by a standard deviation $\sigma_i$ which is given by $(C_{ii})^{1/2}$, in which $C_{ii}$ is the ith diagonal element of the matrix $C_{ik}$ equal to $(B_{ij} A_{jk})^{-1}$. The computer program used to implement this method is Gaussj (see ref. [25]), which relies on a Gauss-Jordan method to solve (44) and invert $(B_{ij} A_{jk})$.
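The normal-equation procedure can be sketched in a few lines. The code below forms the matrix $B_{ij} A_{jk}$ and the vector $B_{ij} b_j$, inverts the former by Gauss-Jordan elimination with partial pivoting, and returns both the solution and the square roots of the diagonal of the inverse, as described in the text. It is a minimal pure-Python illustration with hypothetical function names, not the routine used by the authors, and it makes no attempt at the numerical safeguards a production solver would need.

```python
def gauss_jordan_inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def least_squares(A, b):
    """Solve the normal equations (A^T A) X = A^T b; return X and sqrt(diag((A^T A)^-1))."""
    n_rows, n_cols = len(A), len(A[0])
    BA = [[sum(A[i][j] * A[i][k] for i in range(n_rows)) for k in range(n_cols)]
          for j in range(n_cols)]
    Bb = [sum(A[i][j] * b[i] for i in range(n_rows)) for j in range(n_cols)]
    C = gauss_jordan_inverse(BA)
    X = [sum(C[j][k] * Bb[k] for k in range(n_cols)) for j in range(n_cols)]
    sigma = [C[j][j] ** 0.5 for j in range(n_cols)]
    return X, sigma


# Toy overdetermined problem: fit b = X0 + X1 * t at t = 0, 1, 2, 3.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
X, sigma = least_squares(A, b)
```

For the toy data the fit recovers the intercept 1 and the slope 2. In the reconstruction problem, the rows of A would be built from the monomials of (40) and the unknowns would be the constants $N_{jkm}$, $D_{jkm}$.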

The  Rossler  and  Lorenz  systems  are  integrated 
with  a  fourth-order  Runge-Kutta  algorithm  to 



generate  series  {.v,}  with  time  steps  hi  =  10  ’  and 
10  respectively  (the  pseudo-periods  for  these 
systems  are  6.22  and  0.73,  respectively).  Succes¬ 
sive  derivatives  are  evaluated  by  second-order 
and  fourth-order  centered  schemes.  The  least- 
square  algorithm  is  run  with  N  =  10'  time  steps, 
with  a  sampling  time  of  10  '  in  both  cases. 

Results  obtained  with  second-order  finite  dif¬ 
ferences  are  given  in  tables  1  and  2,  columns  (2). 
The  average  e's  per  constant  are  6  x  10  ''  and 
6x  10  and  the  average  J's  per  constant  are 
3  X  10  ^  and  9  x  10  respectively.  The  im¬ 
provement  for  the  e's  with  respect  to  columns 
(1)  is  not  very  impressive  but  it  must  be  re¬ 
membered  that,  for  these  columns,  constants 
theoretically  equal  to  zero  were  dismissed,  while 
all  constants  are  kept  for  the  least-square  meth¬ 
od.  If  we  compare  the  least-square  technique 
(columns (2)) and the Cramer technique with all constants included in the problem (refs. [20-23]), then the gains for the $\epsilon$'s are two to nearly three orders of magnitude, and the gains for the $\Delta$'s are nearly one to two orders of magnitude, for the Lorenz and Rossler systems, respectively. It is of interest to point out that the gain of accuracy is bigger for the Rossler system than for the Lorenz one, in correlation with the fact that we previously found that accurate reconstructions were more difficult for the Rossler case.

Results for fourth-order finite differences are given in tables 1 and 2, columns (3). When compared with the results for second-order finite differences, we find that average $\epsilon$ and $\Delta$ are more accurate by one order of magnitude for the Rossler system. Conversely, as already commented above, improvements are less spectacular for the Lorenz system. The gain for average $\epsilon$ is only by a factor 3 while we even have a small deterioration of accuracy by a factor 2 for the average $\Delta$. Still, we may conclude that the use of the fourth-order scheme is more efficient than the use of the second-order scheme.
Still improved results could be obtained by combining Cramer and least-squares techniques. Noise-dominated constants can indeed be easily identified by the Cramer technique (refs. [20-23]). Then, dismissing them, we would be left with simpler least-squares problems than the ones we solved, presumably leading to more accurate results. However, section 4 will show that qualitative and quantitative validations of the quality of our reconstructions are very satisfactory. Therefore, further improvements do not appear to be warranted.

4.  Result  validations 

We start with qualitative validations relying on the visual comparison between attractors produced by the OS's and ISRS's. All displayed trajectories lasted for 100 pseudo-periods. Fig. 1 shows the Rossler OS obtained by integrating the vector field of (9)-(11). Fig. 2 shows the Rossler ISRS obtained by integrating the vector field of (21)-(23). Values of reconstruction constants are taken from table 1, column (3), and the reconstructed values of constants $N_{jkm}$, $D_{jkm}$ which are theoretically equal to 0 (not given to avoid marginal data proliferation) are also included in the vector field. Both figures compare very favourably. Sensitivity to initial conditions alone is sufficient to prevent figs. 1 and 2 from being identical. For the Lorenz system, we may similarly compare the OS ((12)-(14), fig. 3) and the ISRS ((24)-(26), with $N_{jkm}$, $D_{jkm}$ from table 2, column (3), including again the other constants not given, fig. 4). The comparison is exceptionally good.

Fig. 1. Rossler case: original system.

Fig. 2. Rossler case: inverse standard reconstructed system.

Fig. 3. Lorenz case: original system.

Fig. 4. Lorenz case: inverse standard reconstructed system.

For quantitative validations, we shall again compare generalized dimensions $D_q$ evaluated with a fixed-radius approach as in refs. [20-23]. Details on algorithms used and notations are in particular available from refs. [1,20], in which


the  reader  may  find  extensive  quotations  con¬ 
cerning  the  pioneering  literature.  For  both  Ros- 
sler  and  Lorenz  systems,  trajectories  are  com¬ 
puted  with  a  fourth-order  Runge-Kutta  al¬ 
gorithm  with  time  step  ht=  10 About  62  vec¬ 
tors  and  73  vectors  were  sampled  per  pseudo¬ 
period  for  the  Rossler  and  the  Lorenz  systems 
respectively.  The  total  duration  of  the  trajec¬ 
tories  were  about  30000  in  both  cases.  Local 
slopes $D_q(r_i)$ are evaluated at 45 $r_i$-locations separated by equal logarithmic intervals on a range $(r_1, r_2)$. The ranges $(r_1, r_2)$ were (0.02, 0.8) and (0.2, 4) for the Rossler and Lorenz systems respectively. $D_q$-computations are performed for $q \in [-45, +45]$. $D_q$-results are obtained by averaging local slopes $D_q(r_i)$ on a range $(r_{min}, r_{max})$ for which we obtain a plateau of good quality. Ranges $(r_{min}, r_{max})$ were (0.083, 0.345) and (0.368, 1.440) for the Rossler and Lorenz systems, respectively. These ranges could have been dramatically increased for some q's (in particular q = 2) but, as they have been chosen, they were valid for all q's. This is in contrast for instance with previous computations reported in ref. [20] in which the range $(r_{min}, r_{max})$ was changed with q. This is in part due to the fact that, by using a more powerful computer, we have been able to afford computations with

big  resolutions  (N,  m)  =  (2  x  10'’,  10^)  in  which 
N is the size of the temporal sequence, i.e. the number of sampled vectors, and m the number of central vectors used to average local correlation moments. For a given system, the OS and the ISRS are studied under exactly the same specifications. Therefore, comparisons between results make sense even if the $D_q$-values themselves are biased due to algorithm shortcomings. Finally, we also evaluate $\sigma_D(q)$, the standard deviation of the $D_q(r_i)$'s in the plateau $(r_{min}, r_{max})$. These $\sigma_D$'s therefore provide a quantity evaluating the quality of the plateau. They also provide a clue for the accuracy of the $D_q$-evaluations, although it is a poor one.
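The plateau-averaging step described above can be sketched as follows. The code generates 45 radii separated by equal logarithmic intervals on the Rossler range (0.02, 0.8), then averages a list of local slopes over the plateau (0.083, 0.345) and returns the mean together with the standard deviation. The function names are hypothetical, the local slopes are placeholder values rather than computed dimensions, and the use of the population standard deviation (division by the number of points) is an assumption of this example.

```python
import math


def log_spaced_radii(r1, r2, n=45):
    """n radii separated by equal logarithmic intervals on (r1, r2)."""
    step = (math.log(r2) - math.log(r1)) / (n - 1)
    return [math.exp(math.log(r1) + i * step) for i in range(n)]


def plateau_average(radii, local_slopes, r_min, r_max):
    """Mean and standard deviation of the local slopes with r_min <= r <= r_max."""
    values = [d for r, d in zip(radii, local_slopes) if r_min <= r <= r_max]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


radii = log_spaced_radii(0.02, 0.8)
placeholder_slopes = [2.0] * len(radii)   # illustrative only, not computed data
D_q, sigma_D = plateau_average(radii, placeholder_slopes, 0.083, 0.345)
```

A perfectly flat plateau, as in the placeholder data, gives a vanishing standard deviation; real local slopes fluctuate and yield the $\pm$ values quoted in the tables.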

For the Rossler system, some results are provided in table 4 and all results are given in figs. 5 and 6. $\epsilon_D\%$ is the relative difference in percent between $D_q$(OS) and $D_q$(ISRS) with respect to $D_q$(OS). $\epsilon_\sigma\%$ is an accuracy indicator equal to $50\,[\sigma_D(q, \mathrm{OS})/D_q(\mathrm{OS}) + \sigma_D(q, \mathrm{ISRS})/D_q(\mathrm{ISRS})]$, i.e. to the arithmetic average over OS and ISRS of $\sigma_D(q)/D_q$ in percent. In interpreting these data, we must remember that q = 1 is a pivot value. For q > 1, $D_q$'s probe parts of the attractor where the measure is most concentrated, leading to good statistics and accurate results while, conversely, for q < 1, $D_q$'s probe parts of the attractor where the measure is most rarefied, leading to poor statistics and inaccurate results. However, even then, fig. 5 expressing $D_q$'s versus q shows a very good agreement between the OS and the ISRS on the whole


Table 4

Some exemplifying results for OS and ISRS generalized dimensions $D_q$. Rossler system.

q      $D_q \pm \sigma_D$ (OS)    $D_q \pm \sigma_D$ (ISRS)    $\epsilon_D\%$    $\epsilon_\sigma\%$
-30    2.16 ± 0.16                2.14 ± 0.14                  1.3               7.0
-10    2.11 ± 0.09                2.09 ± 0.08                  0.9               4.0
0      1.9831 ± 0.017             1.9493 ± 0.024               0.6               1.1
1      1.9130 ± 0.009             1.9209 ± 0.017               0.4               0.7
2      1.8939 ± 0.005             1.8982 ± 0.009               0.2               0.4
10     1.8198 ± 0.083             1.8179 ± 0.086               0.1               4.6
30     1.7695 ± 0.173             1.7687 ± 0.179               0.04              9.9
Fig. 5. Rossler case: comparison between generalized dimensions of OS and ISRS.


range of studied q's. In this figure, we first plotted OS results and, afterward, ISRS results, dismissing data when ISRS results are not distinguishable from OS data. For q > -6, OS and ISRS results cannot be distinguished in most cases and the agreement is therefore nearly perfect. Even for q < -6, the relative differences $\epsilon_D\%$ are very small, being only 1.1% for q = -45. This figure should be compared with fig. 5 in ref. [20], showing an impressive improvement.


We  note  that  underestimations  of  D^'s,  for  q  ^0, 
in fig. 5, ref. [20], still exist in the present fig. 5 of this paper, but have been significantly reduced. Since this statement holds for both OS and ISRS, we attribute this decrease of underestimations to the increase of the resolution (N, m). Fig. 6 compares $\epsilon_D\%$ and $\epsilon_\sigma\%$. The smallest value of $\epsilon_\sigma\%$ is obtained for q = 2 (correlation dimension) which is indeed reputed to be the simplest order to study. $\epsilon_D$ is always


Fig. 6. Rossler case: comparison between $\epsilon_D$, an accuracy criterion for comparison between OS and ISRS, and $\epsilon_\sigma$, a criterion for plateau quality.
smaller than $\epsilon_\sigma$, and even much smaller, except for q near the pivot value 1 where the slope $dD_q/dq$ is large. Note also, as expected, that $\epsilon_D$'s are much smaller for positive q's than for negative q's. From this discussion, we conclude that the $D_q$-comparisons between OS and ISRS are very satisfactory. In refs. [21,22] a more sophisticated approach to discuss the quality of comparisons is provided as a result of the fact that $\sigma_D$ is indeed a poor statistical criterion of accuracy. This approach relies on making averages on many systems to obtain a more realistic standard deviation $\sigma_D'$, typically smaller than $\sigma_D$, therefore leading to more severe discussions. The qualitative statement that agreements are satisfactory was not modified by this other quantitative approach to evaluate the quality of $\epsilon_D$-values.
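The two indicators $\epsilon_D\%$ and $\epsilon_\sigma\%$ are simple enough to be recomputed from the tabulated dimensions. The sketch below applies their definitions to the q = 2 row of table 4 (1.8939 ± 0.005 for the OS, 1.8982 ± 0.009 for the ISRS). The function names are hypothetical, and small differences with the tabulated indicators are to be expected for other rows because the table entries are rounded.

```python
def epsilon_D_percent(d_os, d_isrs):
    """Relative difference between D_q(OS) and D_q(ISRS), in percent of D_q(OS)."""
    return 100.0 * abs(d_os - d_isrs) / d_os


def epsilon_sigma_percent(d_os, s_os, d_isrs, s_isrs):
    """Arithmetic average over OS and ISRS of sigma_D / D_q, in percent."""
    return 50.0 * (s_os / d_os + s_isrs / d_isrs)


# q = 2 row of table 4 (Rossler system).
eps_D = epsilon_D_percent(1.8939, 1.8982)
eps_sigma = epsilon_sigma_percent(1.8939, 0.005, 1.8982, 0.009)
```

Rounded to one decimal place, these two numbers agree with the values 0.2 and 0.4 listed in table 4.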

For the Lorenz system, results are similarly given in table 5 and figs. 7 and 8. Most comments would be similar and would again lead us to the conclusion that the agreement between OS and ISRS is satisfactory. However, some specific comments are required, linked to the fact that $D_q$-computations for the Lorenz system are more difficult than for the Rossler case, as


Table 5

Some exemplifying results for OS and ISRS generalized dimensions $D_q$. Lorenz system.

q      $D_q \pm \sigma_D$ (OS)    $D_q \pm \sigma_D$ (ISRS)    $\epsilon_D\%$    $\epsilon_\sigma\%$
-30    2.9 ± 1.1                  2.6 ± 0.9                    12                37
-10    2.9 ± 0.9                  2.6 ± 0.8                    12                31
0      2.0770 ± 0.025             2.0699 ± 0.025               0.3               1.2
1      2.0479 ± 0.002             2.0490 ± 0.001               0.05              0.07
2      2.0690 ± 0.002             2.0697 ± 0.002               0.03              0.1
10     2.1730 ± 0.013             2.1741 ± 0.014               0.05              0.6
30     2.1933 ± 0.020             2.1956 ± 0.019               0.1               0.9

we repeatedly observed in previous works. In fig. 7 agreement seems perfect for $q \geq 0$ but, for q < 0, $\epsilon_D$'s are much bigger than in the Rossler case. They are however still much smaller than $\epsilon_\sigma$'s (fig. 8) but these $\epsilon_\sigma$'s are also much bigger than for the Rossler system. Actually, all $\epsilon_D$'s are smaller or much smaller than $\epsilon_\sigma$'s (fig. 8) but the difference of behaviour between the cases q > 1 and q < 1 forced us to use two different ordinate scales in fig. 8. Furthermore, computed $D_q$'s increase when q increases for $q \in [1, 25]$. This is an artifact because $D_q$'s must theoretically satisfy the relation $D_q \leq D_p$ for q > p. Hopefully, computations with still bigger resolutions (N, m) would simultaneously improve the results for small q's and remove the artifact, but such bigger resolutions could be difficult to afford in terms of CPU-requirements. However, we mention that $D_q$-computations of the OS once performed with (N, m) = (10*, 10*) and a time-delay reconstruction technique with variable z did not exhibit the artifact. The reader should refer to ref. [23] for more details on these shortcomings.

Fig. 7. Lorenz case: comparison between generalized dimensions of OS and ISRS.
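The artifact mentioned above can be detected mechanically, since generalized dimensions must not increase with q. The sketch below scans the OS column of table 5 and reports every pair of consecutive q's for which the computed $D_q$ increases. The function name is hypothetical, and the tabulated values are used without their error bars, which is a simplification of this example.

```python
def monotonicity_violations(dims):
    """Pairs (p, q) of consecutive orders, p < q, for which D_q > D_p."""
    qs = sorted(dims)
    return [(p, q) for p, q in zip(qs, qs[1:]) if dims[q] > dims[p]]


# OS column of table 5 (Lorenz system), central values only.
lorenz_os = {-30: 2.9, -10: 2.9, 0: 2.0770, 1: 2.0479,
             2: 2.0690, 10: 2.1730, 30: 2.1933}
violations = monotonicity_violations(lorenz_os)
```

All reported pairs lie at $q \geq 1$, which is where the text locates the artifact.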

5.  Conclusions 

This paper presented the current state of our
effort to promote automatic reconstruction of
phenomenological models from numerical scalar
time series, exemplifying results on two
paradigms, namely the Rössler and the Lorenz
systems. Besides our main motivation, the discussed
techniques open the way to several other
lines of research, such as flow forecasting and others
reported in previous references. Our final motivation
is, however, to extend and generalize the
present work to the study of experimental
noisy systems. One problem is to provide a
systematic study of techniques available to model
the standard function F. To assess the degree of
generality of such techniques, we intend to investigate
the example of chaotic attractors produced
by a model of thermal lens oscillations [1]. These
attractors are generated by a rather exotic vector
field which will provide an acid test. The second
problem concerns the presence of noise in real
data. This raises the issues of the sensitivity
to noise of the techniques we used, and of noise
smoothing and removal. We have little doubt
that these problems will eventually find adequate
solutions, providing applied scientists with
new tools of interest for data and system modelling.

Since the pioneering work of Packard et al. and Takens it has become customary to reconstruct the topology of
attractors in phase space from a time series of one-dimensional experimental observations by using delay coordinates.
Many practical refinements of the original methods have been developed.

Many experimental systems possess symmetry, and bifurcations can cause changes in the symmetry of observed states.
These changes are quite subtle when the dynamics is chaotic. It is therefore important to reconstruct not just the topology
of the attractor, but its symmetry. We indicate how this can be done by extending the Packard-Takens approach to a single
equivariant observation, taking values not in the real numbers $\mathbb{R}$ but in a linear representation V of the symmetry group G.
In effect a single set of symmetrically related observations is required. Our central point is that not all plausible choices for
such a set can generate embeddings. In order for the method to produce an embedding, it is necessary that V should be
“sufficiently complicated”. More precisely, the phase space M must be subordinate to V in a sense introduced by
Wassermann. This concept is technical, but unavoidable in this context, and it greatly clarifies the issue of embeddability.
Using it, we state a symmetric version of the Takens embedding theorem, and sketch the proof. We also discuss the issue
of “setwise” versus “pointwise” symmetry of an attractor, and relate this to the transition from spatial order to spatial
disorder in temporally chaotic systems.


1.  Introduction 

The  problem  of  detecting  deterministic  chaos 
in  experimental  data,  and  distinguishing  it  from 
random  noise,  has  stimulated  the  development 
of  new  methods  for  analysing  time  series.  The 
first  of  these  was  the  delay  coordinate  method  of 
Packard  et  al.  [21]  and  Takens  [29],  which  per¬ 
mits  the  topology  of  phase  space,  dynamics,  and 
attractors  to  be  reconstructed  from  a  time  series 
of  a  single  “generic”  observation.  This  method  is 
justified  by  the  Takens  embedding  theorem,  see 
ref.  [29]  theorem  1.  It  has  been  refined  by 
Broomhead  and  King  [3, 4]  who  use  principal 
component  analysis  to  overcome  certain  practical 
difficulties  in  its  implementation.  Many  other 
variants  now  exist,  not  all  of  which  have  been 
given  rigorous  justification. 

Many systems of experimental interest possess
symmetry. Fluid flows, combustion, and convection
often take place within symmetric containers:
cylinders, spheres, rectangular boxes,
annuli. Conventional nonlinear dynamics focuses
on “generic” behaviour, but symmetry almost by
definition is “nongeneric” in the conventional
sense. For example, one of the generic hypotheses
of the Takens embedding theorem is that
the flow should have simple eigenvalues at low-period
periodic points. However, in symmetric
systems eigenspaces are invariant under the symmetry
group, and this can force multiple eigenvalues
to occur, because (with very few exceptions)
irreducible representations are multidimensional.

It is therefore necessary to modify the
theoretical approach, considering systems that
are ‘generic subject to possessing the given symmetry’.
Dynamics in the presence of symmetry,
or equivariant dynamics, has now become a well
defined sub-area of dynamical systems theory.
For example, Golubitsky et al. [14] have given
an extensive treatment of non-chaotic bifurcations.
Chaos in symmetric systems has been discussed
by many authors including Chossat and
Golubitsky [7, 8], Field and Golubitsky [11, 12],
and King and Stewart [17]. Several new phenomena
occur, notably symmetry-increasing crises
(“collisions” of symmetrically related attractors),
and intermittency just after such a crisis. The
dynamics of crises have been studied by Grebogi
et al. [15].

There is evidence that these symmetry-related
phenomena occur in experimental systems. Ashwin
[1] has detected a symmetry-increasing
crisis in a coupled oscillator circuit; and Golubitsky
[13] has argued that patterned turbulence in
Couette-Taylor flow (such as turbulent Taylor
vortices) can usefully be interpreted as symmetric
chaos. Further experimental examples are
surveyed in King and Stewart [17].

In  equivariant  dynamics,  the  emphasis  is  not 
just  on  the  topology  of  attractors,  but  also  on 
their  symmetries.  Changes  in  symmetry  are  often 
very  robust,  and  provide  excellent  opportunities 
for  new  experiments  on  current  theories  of  non¬ 
linear  dynamics  and  chaos.  What  is  lacking  are 
practical  techniques  for  phase  space  reconstruc¬ 
tion  that  also  preserve  symmetry,  and  theoretical 
foundations  for  such  techniques.  In  this  paper  we 
take  a  step  towards  remedying  this  deficiency  by 
developing  an  equivariant  version  of  one  stan¬ 
dard  method  for  reconstructing  both  the  topolo¬ 
gy  and  the  symmetry  of  attractors  from  time 
series. 

The underlying idea is a simple one: to use not
just a single numerical measurement, or observation,
but a multidimensional set of measurements
that respects the symmetry, that is, an equivariant
observation. This is both experimentally and
mathematically natural, but by making it explicit
we are able to expose, and to some extent deal
with, some basic issues that are peculiar to the
equivariant case. These involve group-theoretic
technicalities: the point we wish to emphasise is
that these technicalities must be taken into account
in order to obtain a general understanding
of the nature of the problem.

The  fundamental  theoretical  result  that  under¬ 
pins  phase  space  reconstruction  methods  is  the 
Takens  embedding  theorem  [29].  Correspond¬ 
ingly  our  central  theoretical  result  is  an 
equivariant  version  of  that  theorem  (theorem  2 
of  section  6  below).  Its  proof  involves  technical 
considerations  in  differential  topology,  and  in 
this  paper  we  outline  only  the  main  points. 
Again  we  shall  argue  that  such  considerations 
are  not  just  technical:  they  place  restrictions 
upon  the  target  of  a  suitable  experimental  obser¬ 
vation,  and  so  have  a  definite  bearing  upon  the 
design  of  experiments  to  detect  symmetric  chaos. 

2.  Experimental  motivation 

Among the phenomena observed in symmetric
systems are “coherent structures”: dynamical
states that combine local chaos with global pattern.
The classic example is the formation of
turbulent Taylor vortices in Couette-Taylor
flow, see [10].

The most interesting bifurcation for our present
purposes is that to turbulent Taylor vortices.
The flow is turbulent, with no genuine symmetries;
but it possesses the symmetry of Taylor
vortices “on the average”. That is, ignoring the
fine texture of the turbulence, the flow has the
same symmetry as Taylor vortices, namely, a
reflectional symmetry in the plane of the vortex
boundary, together with discrete translational
symmetries along the axis (in an infinite cylinder
model). Golubitsky [13] traces this sudden reappearance
of pattern to a symmetry-increasing
crisis of conjugate chaotic attractors: see also
[17]. The evidence in favour of this explanation
is circumstantial but quite strong. Similar ideas
apply to some other coherent structures such as
spiral turbulence.

Another system rich in symmetry is an electronic
circuit formed from n symmetrically coupled
identical oscillators. Ashwin, King, and
Swift [2] study systems of this type when the
coupling is fully symmetric. The symmetry group
is then $S_n$, the symmetric group of degree n,
acting as permutations of the oscillators. The
main focus of [2] is non-chaotic dynamics. Ashwin
[1] studies chaos in such systems when
n = 3 or 4. For experiments he uses a system of
coupled identical Van der Pol oscillators forced
by a sinusoidal signal. The results are visualised
by assigning one of n unit vectors, arranged in
the plane at angles of $2\pi/n$, to the voltage in
each oscillator, and plotting a Poincaré section
synchronised with the times at which the forcing
signal generator passes through zero from negative
to positive. This projection of the voltage
data is chosen to preserve symmetry, a theme we
take up more generally in section 5 below.

In  both  of  these  experimental  systems  we  ob¬ 
serve  the  influence  of  symmetry  of  the  apparatus 
upon  the  dynamics,  leading  to  pattern-formation 
via  symmetry-breaking.  Moreover,  once  the 
dynamics  has  become  chaotic,  we  see  bifurca¬ 
tions  that  increase  the  symmetry  in  some  time- 
averaged  sense,  leading  to  certain  kinds  of  co¬ 
herent  structure. 


3. Equivariant dynamics

We  now  describe  a  theoretical  framework  in 
which  such  effects  can  be  described  and  ana¬ 
lysed.  We  assume  some  familiarity  with  “tradi¬ 
tional”  dynamical  systems  theory,  see  for  exam¬ 
ple  [16].  In  order  to  discuss  examples  from  a 
general  viewpoint,  we  begin  by  introducing  some 
basic  terminology  from  equivariant  dynamics. 
For  simplicity  we  concentrate  on  discrete 
dynamical  systems,  although  similar  considera¬ 
tions  apply  to  the  continuous  case. 

A discrete dynamical system is given by an
equation of the form

$x_{t+1} = f(x_t, \lambda)$.    (1)

Here $f: M \times \mathbb{R}^r \to M$ is a $C^\infty$ mapping (or perhaps
$C^k$ for some k) defined on a manifold M,
the phase space, and $\lambda \in \mathbb{R}^r$ is an r-dimensional
bifurcation parameter. Usually either r = 0 (no
bifurcation parameter) or r = 1.

Technically, it is normal to assume that f
should be a diffeomorphism (invertible smooth
map with smooth inverse). This allows the
dynamics to be continued for negative time.
However, the dynamics of non-invertible maps f
can be approximated by projections of those of
invertible maps on spaces of double the dimension,
as proved in the shadow lift lemma of King
and Stewart [17]. We therefore permit non-invertible
maps in our examples and numerical
experiments.

Similar considerations apply to continuous
systems

$\frac{dx}{dt} = f(x, \lambda)$,    (2)

in which case f is a vector field on M.

Some  important  practical  issues  in  time  series 
analysis  are  different  for  continuous  systems 
compared  to  discrete  ones,  but  both  have  similar 
theoretical  foundations. 

A subset A of M is said to be an attractor for
(1) if

(a) f(A) = A.

(b) A has a dense orbit.

(c) There exists a neighbourhood U of A such
that for every $u \in U$ we have $\omega(u) \subset A$, where
$\omega(u)$ is the $\omega$-limit set of u.

A  similar  definition  holds  for  the  continuous 
case. 

Suppose that a compact Lie group G acts
smoothly on M, so that M becomes a G-manifold.
Let $\gamma m$ denote the image of $m \in M$
under the action of $\gamma \in G$. We say that f is
G-equivariant if

$f(\gamma m) = \gamma f(m)$

for all $m \in M$, $\gamma \in G$. (For more general maps f
between possibly distinct spaces, the actions of G
on source and target may differ.) Make the
following definitions:

(a) The G-orbit of $m \in M$ is

$Gm = \{\gamma m \mid \gamma \in G\}$.

For equivariant f, if A is an attractor for the
dynamics then so is $\gamma A$ for all $\gamma \in G$. We say
that $\gamma A$ is conjugate to A.

It is important to distinguish the G-orbit of m
from its dynamical orbit, or trajectory. We denote
the latter by $O(m)$.

(b) The isotropy subgroup of $m \in M$ is

$\Sigma_m = \{\gamma \in G \mid \gamma m = m\}$.

This measures the degree of symmetry of a point
m.

(c) We also require a generalisation. If $m \in M$
we define the (orbital) symmetry group of m to
be

$\Lambda_m = \{\gamma \in G \mid \gamma \overline{O(m)} = \overline{O(m)}\}$.

Here bars denote topological closure in M.

If A is an attractor having a dense orbit $O(m)$,
then the symmetry group of A, written $\Lambda_A$, is
defined to equal $\Lambda_m$. It is the set of $\gamma \in G$ that
leave A setwise fixed (but not necessarily pointwise
fixed).

(d) If $\Sigma$ is a subgroup of G (which henceforth
we write as $\Sigma \le G$) then the fixed-point set of $\Sigma$
is

$\mathrm{Fix}(\Sigma) = \{m \in M \mid \sigma m = m \text{ for all } \sigma \in \Sigma\}$.

We write this as $\mathrm{Fix}_M(\Sigma)$ when several distinct
G-manifolds are under consideration and it is
necessary to identify which one is intended.

(e) The orbit space M/G is the set of G-orbits
of M, with the quotient topology defined by the
natural map $M \to M/G$, which sends $x \in M$ to
the orbit Gx. In general M/G is not a manifold:
it may possess singularities. Intuitively M/G is
what remains of M when we identify points that
map to each other under the G-action, that is, if
we “factor out the symmetry”.
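
The definitions of G-orbit and isotropy subgroup can be made concrete for a small finite group. The following is a minimal sketch, assuming the dihedral group of order 6 acting on the plane (identified with the complex numbers) by rotations through multiples of 2π/3 and by complex conjugation; the function names are illustrative and not taken from the paper.

```python
import cmath
import math

# The dihedral group D_3 acting on the plane (identified with C):
# elements are z -> w^k z (rotations) and z -> w^k conj(z) (reflections),
# where w = exp(2*pi*i/3). Each element is stored as the pair (k, flip).
N = 3
OMEGA = cmath.exp(2j * math.pi / N)
GROUP = [(k, flip) for k in range(N) for flip in (False, True)]


def act(g, z):
    """Apply the group element g = (k, flip) to the point z."""
    k, flip = g
    return OMEGA ** k * (z.conjugate() if flip else z)


def isotropy_subgroup(z, tol=1e-9):
    """All group elements that fix z (the isotropy subgroup of z)."""
    return [g for g in GROUP if abs(act(g, z) - z) < tol]


def group_orbit(z, tol=1e-9):
    """The distinct points g*z as g ranges over the group."""
    points = []
    for g in GROUP:
        p = act(g, z)
        if all(abs(p - q) >= tol for q in points):
            points.append(p)
    return points


# Origin, a point on a reflection axis, and a point in general position.
for z in (0j, 1 + 0j, 0.3 + 0.2j):
    print(z, len(isotropy_subgroup(z)), len(group_orbit(z)))
```

In this sketch the size of the isotropy subgroup times the size of the G-orbit always equals the order of the group, so more symmetric points have smaller orbits.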

4.  Symmetry-increasing  crises 

We  briefly  describe  some  examples  of  crises  in 
symmetric  systems.  Their  occurrence  emphasises 
the  need  to  understand  the  symmetry  of  an 
attractor  as  well  as  its  topology. 

The simplest nontrivial group action is $G = \mathbf{Z}_2$
acting on the line $M = \mathbb{R}$ by $x \mapsto -x$. The cubic
logistic map $x \mapsto \lambda x(1 - x^2)$ is equivariant for this
action. See also [7, 9, 24]. For this map there is
an initial bifurcation from a $\mathbf{Z}_2$-symmetric fixed
point x = 0 to an asymmetric fixed point. This is
followed by a period-doubling cascade (of asymmetric
points), leading to chaos. There is then a
new phenomenon: a sudden “explosion” of the
attractor, from an asymmetric interval contained
in the positive half-line into a symmetric interval
containing both negative and positive values of
x. This is an example of the phenomenon that
Grebogi et al. [15] call a crisis. Two disjoint
strange attractors (here related by symmetry)
have collided and “fused” into a single attractor.
The symmetric attractor that exists after the
crisis is not just a union of two separate asymmetric
attractors: it is indecomposable, that is, it
has a dense orbit. The curious feature of such an
event is that it increases the symmetry of the
attractor. In this example, before the crisis there
are two distinct attractors $A_1$, $A_2$, each with
trivial symmetry $\Lambda_{A_i} = 1$. After the crisis there is
one attractor A with non-trivial symmetry $\Lambda_A =
\mathbf{Z}_2$. In non-chaotic dynamics, bifurcations tend to
break symmetry. It is true that reversing the
bifurcation parameter trivially turns a symmetry-breaking
bifurcation into a symmetry-increasing
one, but generally there is a “natural” direction
for the bifurcation parameter, in which the degree
of nonlinearity increases, often leading from
order into chaos; and for non-chaotic states the
bifurcations in that direction normally lead to
less and less symmetry. However, when chaotic
attractors are present, symmetry-increasing
crises are common. They have been reported
many times in the literature: see for example
[7, 8, 18, 23, 25, 28].
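
The mechanism behind the crisis in the cubic logistic map can be checked numerically. The sketch below assumes the map has the form x ↦ λx(1 − x²). On [0, 1] this function peaks at x = 1/√3 with value 2λ/(3√3), so the interval (0, 1) is mapped into itself as long as λ stays below 3√3/2; above that value, points near the peak are thrown past x = 1 and then across zero. The parameter values 2.5 and 2.7 and all names are illustrative choices, not values from the paper.

```python
import math


def cubic_logistic(x, lam):
    """One step of the cubic logistic map x -> lam * x * (1 - x**2)."""
    return lam * x * (1.0 - x * x)


def orbit(x0, lam, n):
    """The first n iterates of x0 (x0 itself is not included)."""
    xs, x = [], x0
    for _ in range(n):
        x = cubic_logistic(x, lam)
        xs.append(x)
    return xs


# The peak value on [0, 1] reaches 1 when lam = 3*sqrt(3)/2 (about 2.598).
LAM_CRISIS = 3.0 * math.sqrt(3.0) / 2.0
X_PEAK = 1.0 / math.sqrt(3.0)

before = orbit(0.3, 2.5, 2000)       # lam below the threshold
after_peak = orbit(X_PEAK, 2.7, 2)   # lam above it, started at the peak

print(min(before) > 0.0)   # the orbit never leaves the positive half-line
print(after_peak)          # first iterate exceeds 1, second is negative
```

By equivariance the same statements hold with all signs reversed, which is why the two conjugate attractors meet at the same parameter value.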

A rich source of numerical examples of symmetry-increasing
crises is afforded by mappings
of the plane equivariant under the standard action
of the dihedral group $D_n$. Chossat and
Golubitsky [8] have studied the family of $D_n$-equivariant
mappings

$f(z, \lambda) = (\alpha u + \beta v + \lambda) z + \gamma \bar{z}^{n-1}$.    (3)

Field and Golubitsky [11, 12] have produced
high-resolution computer pictures of the attractors
of (3) in which pixels are colour-coded
according to how many times the point lands on
them. This gives a visual representation of the
invariant measure on the attractor. Other examples
of symmetric crises may be found in Chossat
and Golubitsky [8] and King and Stewart [17]. A
theoretical explanation of why crises of conjugate
attractors in symmetric systems lead to an
increase of symmetry has been given by Chossat
and Golubitsky [8].
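
Equivariance of the family (3) is easy to test numerically. The following is a minimal sketch, assuming that u = |z|² and v = Re(zⁿ); the text does not define u and v, so this choice is an assumption (any real multiple of Re(zⁿ) would be equally equivariant). The parameter values are those read from the caption of fig. 1(a), with n = 3; since β = 0 there, the normalisation of v does not affect the result.

```python
import cmath
import math


def dn_map(z, lam, alpha, beta, gamma, n):
    """One step of the planar D_n-equivariant map (3).

    Assumption: u = |z|^2 and v = Re(z^n); the text does not define them.
    """
    u = abs(z) ** 2
    v = (z ** n).real
    return (alpha * u + beta * v + lam) * z + gamma * z.conjugate() ** (n - 1)


# Values as read from the caption of fig. 1(a), with n = 3.
params = dict(lam=1.3, alpha=-0.9, beta=0.0, gamma=-0.8, n=3)
rotation = cmath.exp(2j * math.pi / params["n"])
z = 0.3 + 0.2j

# Equivariance: rotating or reflecting before the map equals doing so after.
rotation_defect = abs(dn_map(rotation * z, **params) - rotation * dn_map(z, **params))
reflection_defect = abs(dn_map(z.conjugate(), **params) - dn_map(z, **params).conjugate())
print(rotation_defect, reflection_defect)   # both zero up to rounding
```

The two generators checked here (rotation through 2π/n and reflection in the real axis) generate the whole dihedral group, so testing them is enough.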

5.  Equivariant  observations 

We now introduce the simple idea that lies
behind equivariant phase space reconstruction.
First, recall the basis of the Packard-Takens
method. Suppose that M is a manifold (phase
space) upon which is defined a vector field (continuous
dynamic) or diffeomorphism (discrete
dynamic) $\phi$. Let $h: M \to \mathbb{R}$ be an observation.
Then a time series of observations is defined
by $x_j = h(\phi^j(x_0))$ for an initial condition $x_0$. A
“window size” N is chosen, and the time series
of N-dimensional vectors

$z_j = (x_j, x_{j+1}, \ldots, x_{j+N-1})$

is formed. Provided N is large enough (2n + 1,
where n is the dimension of M, suffices) and h
and $\phi$ satisfy suitable generic hypotheses, the
dynamics of $z_j$ in $\mathbb{R}^N$ provide an accurate topological
reflection of those of $x_0$ on M. The most
important feature of this method is that a one-dimensional
observation h is used to deduce
topological features of a multidimensional
dynamic.
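
The construction of delay vectors from a scalar time series can be sketched in a few lines. The example below is illustrative only: the dynamics (a rigid rotation of the circle) and the observation (the cosine of the angle) are hypothetical choices, and the function names are not from the paper. With a one-dimensional phase space, the window size 2n + 1 is 3.

```python
import math


def observe(h, phi, x0, length):
    """Time series x_j = h(phi^j(x0)) for j = 0, ..., length - 1."""
    xs, x = [], x0
    for _ in range(length):
        xs.append(h(x))
        x = phi(x)
    return xs


def delay_vectors(series, window):
    """The vectors z_j = (x_j, x_{j+1}, ..., x_{j+N-1}) with N = window."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [tuple(series[j:j + window]) for j in range(len(series) - window + 1)]


def phi(theta):
    """Hypothetical dynamics: rotation of the circle by 0.5 radians."""
    return (theta + 0.5) % (2.0 * math.pi)


series = observe(math.cos, phi, 0.0, 6)
vectors = delay_vectors(series, 3)
print(len(vectors))   # 4 windows of length 3 from 6 samples
```

Consecutive delay vectors overlap in all but one entry, which is what makes the induced dynamics on the reconstructed points a shift along the time series.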

The examples and experiments discussed in
previous sections motivate a general setting for
the analysis of symmetric dynamics, and a symmetric
analogue of the Packard-Takens reconstruction
method. Suppose that G is a compact
Lie group of transformations, acting on the
phase space M of the dynamical system. In the
discrete case, the dynamics is defined by a diffeomorphism
$\phi$ on M, playing the role of the
mapping f in (1). However, $\phi$ is now G-equivariant.
Let $h: M \to V$ be a function, that is,
an idealised experimental observation. What
kind of structure should the target V of h have?

If we wish to reconstruct the symmetries of
attractors of $\phi$ on M from information lying in V,
then V must itself carry symmetry information.
That is, V should also be a G-manifold. The
simplest and most natural case is when V is a
Euclidean G-space, that is, affords a linear representation
of G. In this case we can (and henceforth
do) assume that G acts orthogonally in
some metric: see Golubitsky et al. [14], p. 31.

The natural way to ensure that h conveys
information about symmetry is to insist that h
should itself be G-equivariant (for the separate
actions of G on M and V). That is, that

$h(\gamma m) = \gamma h(m)$

for all $m \in M$, $\gamma \in G$. One immediate consequence
of equivariance is that V cannot always
be assumed one-dimensional, as is the case for
phase space reconstruction in non-equivariant
systems.

In fact, these conditions are natural in experiments
aimed at detecting symmetry. For example,
in the system studied by Ashwin [1], the
primary measurement made is the triple of voltages
$(V_1, V_2, V_3)$ at corresponding points in the
three oscillator circuits. Since the symmetry
group $S_n$ acts by permuting the oscillators, the
physical interpretation of equivariance is that the
measurements should be made in the same way
on each oscillator.

If we take an attractor for (3) with full $D_3$
symmetry and plot its x- and y-coordinates as
two separate time series, then the symmetry is by
no means apparent. This happens because the x-
and y-coordinates are not related in a symmetric
fashion; that is, they do not determine an
equivariant observation. If instead we plot (fig.
1) time series for the variables $x$, $-x/2 + \sqrt{3}y/2$,
$-x/2 - \sqrt{3}y/2$, which are permuted by $D_3$, we
observe something rather more interesting.
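
That a triple of readings is permuted by the group can be verified directly. The sketch below assumes the three variables are the projections of (x, y) onto unit vectors at angles 0, 2π/3 and 4π/3, namely x, −x/2 + √3y/2 and −x/2 − √3y/2, and that the group acts on the plane by rotation through 2π/3 and reflection in the x-axis. The function name is illustrative.

```python
import cmath
import math

SQRT3 = math.sqrt(3.0)


def observation(z):
    """The three readings x, -x/2 + sqrt(3) y/2, -x/2 - sqrt(3) y/2."""
    x, y = z.real, z.imag
    return (x, -x / 2 + SQRT3 * y / 2, -x / 2 - SQRT3 * y / 2)


ROTATION = cmath.exp(2j * math.pi / 3)   # generator of the rotations

z = 0.3 + 0.2j
a, b, c = observation(z)

# Rotating the point cyclically permutes the three readings ...
print(observation(ROTATION * z), (c, a, b))
# ... and reflecting in the x-axis swaps the last two.
print(observation(z.conjugate()), (a, c, b))
```

Because the group only shuffles the readings, three time series recorded this way from a fully symmetric attractor should be statistically indistinguishable from one another.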

First, of course, the three time series in fig. 1a,
say, are not identical. If they were, it would
mean that every individual point on the attractor,
rather than just the attractor itself, would have
$D_3$ symmetry, so the attractor would be just the
origin. However, all three time series do have a
remarkably similar appearance. The simplest
way to describe this is to say that if we look at
randomly selected segments from each, it is very
hard to tell which segment is taken from which
time series. The same goes for fig. 1b, in which
we can even detect, by eye, traces of intermittency
caused by the collision of three conjugate
$\mathbf{Z}_2$-symmetric attractors. However, in figs.
1c, 1d there is a clear distinction: two series look
similar but the third is quite different. These
correspond to $\mathbf{Z}_2$-symmetric attractors, and the
symmetry is clearly “visible” in some statistical
sense in the three time series. An explanation of
this effect is given in [17] in terms of the concept
of local isomorphism of time series.

Fig. 1. Four time series for symmetrically placed combinations
of the coordinates of a point in an attractor for the
$D_3$-equivariant map (3). (a) $\alpha = -0.9$, $\beta = 0$, $\gamma = -0.8$, $\lambda = 1.3$.
(b) $\alpha = -1.1$, $\beta = 0.213$, $\gamma = 0.6$, $\lambda = 1.89$.
(c) $\alpha = -0.9$, $\beta = 0$, $\gamma = -0.8$, $\lambda = 1.22$.
(d) $\alpha = -1.1$, $\beta = 0.212$, $\gamma = 0.6$, $\lambda = 1.89$.

The  choice  of  equivariant  observation  is  con¬ 
ditioned  by  the  physical  interpretation  of  the 
symmetry  of  the  system,  including  questions  of 
what  measurements  are  actually  feasible.  The 
topology  of  a  dynamical  attractor  generally  lives 
in  some  unknown  subset  of  phase  space,  and 
hence  bears  only  a  loose  relation  to  physically 
measurable  quantities:  this  is  why  phase  space 
reconstruction  is  necessary.  The  symmetries  of 
systems  that  arise  from  experiments  are  normally 
induced  by  physical  symmetries  of  the  apparatus, 
and  hence  have  a  much  more  direct  interpre¬ 
tation. 

One important point must be mentioned here.
We will show in section 6 below that the target V
of an equivariant observation h must be “sufficiently
complicated” in order for reconstruction
to have a chance of embedding M in a cartesian
power of V. For example, suppose we focus
upon the conjectured reflectional symmetry-on-average
of turbulent Taylor vortices, as described
in section 2. The symmetry group is $\mathbf{Z}_2$.
Its action on phase space (which here is an
infinite-dimensional function space) is by inversion
of the axial coordinate, $z \mapsto -z$. Possible
equivariant observations include:

(1) The axial component of velocity on the
vortex boundary.

(2) The radial component of velocity on the
vortex boundary.

(3) A pair of symmetrically related velocity
measurements on either side of the vortex
boundary.

We show below that neither (1) nor (2) can
guarantee an embedding: only (3) can. (An alternative
is to employ (1) and (2) in combination.)
We do not consider this restriction to be especially
intuitive, but it emerges naturally from the
topological/group-theoretic analysis.

An  example  where  an  inappropriate  choice  of 
equivariant  observations  leads  to  difficulties  is 
reported  by  Lorenz  [19].  See  ref.  [27]  for  further 
discussion. 


6.  An  equivariant  Takens  embedding  theorem 

We  now  proceed  to  our  main  result:  a  state¬ 
ment  and  sketch  of  the  proof  of  an  equivariant 
Takens  embedding  theorem.  We  explain  why  the 
truth  of  such  a  theorem  depends  upon  the  target 
of  the  equivariant  observation  concerned,  and 
describe  how  Wassermann’s  concept  of  phase 
space  being  subordinate  to  a  representation  of 
the  symmetry  group  characterises  the  “good” 
targets. 

The setting for the standard (non-equivariant)
theorem is as follows. Work in the category of
smooth manifolds and maps. Let M be an
n-dimensional manifold (phase space), and let $\phi$
be a diffeomorphism on M (discrete dynamical
system). Let $h: M \to \mathbb{R}$ be a function (observation).
Define the delay-coordinate map

$\Phi_{\phi,h}: M \to \mathbb{R}^{2n+1}$

by

$\Phi_{\phi,h}(x) = (h(x), h(\phi(x)), \ldots, h(\phi^{2n}(x)))$.

Then the Takens embedding theorem (the first
and basic version of three similar theorems
stated and proved by Takens [29]) is:

Theorem 1. Generically in $(\phi, h)$ the map
$\Phi_{\phi,h}$ is an embedding of M in $\mathbb{R}^{2n+1}$.

The other two embedding theorems in Takens
[29] are two versions for continuous dynamics.
One is obtained by setting $\phi$ equal to the time-T
forward map for generic T, and the other uses
successive derivatives of $h(\phi^t(x))$ at t = 0. Sauer
et al. [26] improve upon this result by
strengthening the genericity assumptions, while
making them more explicit.

From our discussion in section 5, we see that
the appropriate setting for an equivariant Takens
embedding theorem should be as follows. Let G
be a compact Lie group. Let M be an n-dimensional
G-manifold, and let $\phi$ be a G-equivariant
diffeomorphism on M. Let V be a Euclidean
G-space, and let $h: M \to V$ be a G-equivariant
mapping (equivariant observation). Let $V^t$ denote
$V \oplus \cdots \oplus V$ with t summands. For given t,
define the delay-coordinate map

$\Phi_{\phi,h,t}: M \to V^t$

by

$\Phi_{\phi,h,t}(x) = (h(x), h(\phi(x)), \ldots, h(\phi^{t-1}(x)))$.

We might hope that for sufficiently large t (perhaps
t = 2n + 1) the map $\Phi_{\phi,h,t}$ is generically a
G-equivariant embedding of M in $V^t$.

However, it is clear in advance that this may
not be the case, no matter how large t we take.
There are “obstacles” to equivariant embedding,
which depend upon the structure of V. The
following examples capture the nature of the
difficulty and motivate the “correct” theorem.

Example 1. Let $M = S^1$ be the circle, realised as
the unit circle

$\{z \mid z = e^{i\theta},\ \theta \in \mathbb{R}\}$

in the complex plane. Let $G = \mathbf{Z}_2$ act on M by
$z \mapsto \bar{z}$. Let $V = \mathbb{R}$ with the nontrivial representation
of G for which $x \mapsto -x$. Let h be any
equivariant map from M to $V^t$ for arbitrary t.

By equivariance, h maps $\mathrm{Fix}_M(\mathbf{Z}_2)$ to
$\mathrm{Fix}_{V^t}(\mathbf{Z}_2)$. The former is $\{0, \pi\}$, the latter $\{0\}$.
Therefore $h(0) = h(\pi) = 0$ and h cannot be an
embedding.

However, h can be an immersion. Indeed we
can construct such an h using a delay coordinate
approach. Define $\phi(z) = z^2$, an equivariant map
(non-invertible, which section 3 permits in examples),
and let $k: M \to V$ be given by
$k(z) = \mathrm{Im}(z)$. Then k is an equivariant observation.
With t = 2 we have the delay coordinate
map $z \mapsto (\mathrm{Im}(z), \mathrm{Im}(z^2))$. In terms of the coordinate
$\theta$ this is $\theta \mapsto (\sin\theta, \sin 2\theta)$. This
is an immersion, and its image, shown in fig. 2,
fails to be one-one precisely on $\{0, \pi\}$.
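
The three claims of Example 1 can be checked numerically for the map θ ↦ (sin θ, sin 2θ). The sketch below is illustrative only: the immersion property is tested on a finite grid rather than proved, and the function names are not from the paper.

```python
import math


def delay_map(theta):
    """theta -> (sin(theta), sin(2*theta)): the delay map of Example 1."""
    return (math.sin(theta), math.sin(2.0 * theta))


def speed(theta):
    """Length of the derivative (cos(theta), 2*cos(2*theta))."""
    return math.hypot(math.cos(theta), 2.0 * math.cos(2.0 * theta))


# Immersion: the derivative never vanishes (sampled check on a fine grid).
grid = [2.0 * math.pi * k / 10000 for k in range(10000)]
min_speed = min(speed(t) for t in grid)

# Not an embedding: the two fixed points theta = 0 and theta = pi collide.
p0, p_pi = delay_map(0.0), delay_map(math.pi)
collision_gap = math.hypot(p0[0] - p_pi[0], p0[1] - p_pi[1])

# Equivariance: theta -> -theta (complex conjugation) negates the image.
t = 0.7
image, flipped = delay_map(t), delay_map(-t)

print(min_speed > 0.0, collision_gap, image, flipped)
```

The image is a figure-eight curve: it passes through the origin twice with different tangent directions, which is exactly the failure of injectivity forced by the fixed-point argument.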

We can modify this map to get an equivariant
embedding by introducing an additional coordinate
with trivial group action, and pulling the
crossing apart along that direction. In fact, the
system can be equivariantly embedded in $\mathbb{R}^2 =
\mathbb{R} \times \mathbb{R}$ with trivial action on the first component
and nontrivial action on the second. Indeed, as
we have described it, it already is embedded in
this manner: the first factor is the x-axis in the
plane, the second is the y-axis.

An experimental system that realises this
phase space and group action is the motion of a
bead on a wire loop that is symmetric about the
vertical axis, under the influence of gravity and
friction. The same group, but not in this precise
representation, arises in the Taylor experiment
in connection with the reflectional symmetry of
turbulent Taylor vortices. The general requirement
that both the trivial and the non-trivial
representation of $\mathbf{Z}_2$ should occur is the reason
why only observation (3) of section 5 guarantees
an embedding. Observation (1) involves only the
non-trivial representation, and observation (2)
only the trivial representation.

Fig. 2. Equivariant immersion of a circle using delay coordinates.

Example 2. Let $M = T^2 = S^1 \times S^1$ be the two-torus,
with angular coordinates $(\alpha, \beta)$. Let $G =
S^1 = \{\theta\}$ act on M by

$\theta(\alpha, \beta) = (\alpha + \theta, \beta)$

so that the action is trivial along the $\beta$ direction.
There are no fixed points.

The natural way to embed a two-torus in
Euclidean space is to embed each generating
circle $S^1$ in $\mathbb{R}^2 = \mathbb{C}$ as the unit circle, so that the
whole torus embeds in $\mathbb{R}^4 = \mathbb{C}^2$ by
$(\alpha, \beta) \mapsto (e^{i\alpha}, e^{i\beta})$. This is equivariant under an
action of $S^1$ that is standard on $\mathbb{C} \times \{0\}$ and
trivial on $\{0\} \times \mathbb{C}$. However, the torus can be
embedded equivariantly using the standard action
by rotation on $\mathbb{R}^2$ in both factors, even
though the action on one generating circle of the
phase space M is trivial. To achieve this, map
$(\alpha, \beta)$ to $(e^{i\alpha}, e^{i(\alpha+\beta)})$.
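
The equivariance of the second torus embedding can be confirmed numerically. The sketch below assumes the embedding (α, β) ↦ (e^{iα}, e^{i(α+β)}), with the circle group acting on the torus by θ·(α, β) = (α + θ, β) and on the target by rotation through θ in both complex factors. The function names and sample angles are illustrative.

```python
import cmath
import math


def embed(alpha, beta):
    """(alpha, beta) -> (e^{i alpha}, e^{i (alpha + beta)}) in C^2."""
    return (cmath.exp(1j * alpha), cmath.exp(1j * (alpha + beta)))


def act_on_torus(theta, alpha, beta):
    """The circle action theta.(alpha, beta) = (alpha + theta, beta)."""
    return (alpha + theta, beta)


def act_on_target(theta, w):
    """Standard rotation by theta on both complex factors."""
    r = cmath.exp(1j * theta)
    return (r * w[0], r * w[1])


def recover(w):
    """Recover (alpha, beta) mod 2*pi from an image point (injectivity)."""
    a = cmath.phase(w[0]) % (2.0 * math.pi)
    b = (cmath.phase(w[1]) - cmath.phase(w[0])) % (2.0 * math.pi)
    return (a, b)


alpha, beta, theta = 0.4, 1.1, 2.3
lhs = embed(*act_on_torus(theta, alpha, beta))
rhs = act_on_target(theta, embed(alpha, beta))
error = max(abs(lhs[0] - rhs[0]), abs(lhs[1] - rhs[1]))
print(error)                        # zero up to rounding
print(recover(embed(alpha, beta)))  # the original angles, up to rounding
```

The second component carries the sum α + β, so the rotation acts nontrivially on it even though the group leaves β itself untouched.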

This phase space and group action occurs in
the Couette-Taylor experiment if we look at,
say, the wavy vortex states in an infinite cylinder
model and let G be the group of translations
along the axis.

It follows from these examples that if a version of the Takens delay
construction is to be used, then the target V of the observation h
cannot be an arbitrary Euclidean G-space. It must be “sufficiently
complicated”. The examples also show that the precise conditions
required must be fairly subtle. They may be found in [30], and go back
to equivariant embedding theorems of Mostow [20] and Palais [22]. Two
alternative approaches are described in [5], but the conditions
required on V are not spelt out there. The obstacles to equivariant
embedding are of two kinds: conditions on group orbits, and conditions
transverse to group orbits.

First we explain the obstacle transverse to group orbits. Suppose that
M is locally G-embedded in $V^t$ near some point $m \in M$. Let
$G_m \subseteq G$ be the isotropy subgroup of m. Then the tangent
space $T_m M$ is a Euclidean $G_m$-space, and decomposes into
irreducibles $W_i$. Because of the existence of a local G-embedding,
each $W_i$ must occur in $V^t$ and hence in V. Therefore V must
contain at least one copy of every irreducible $G_m$-space that arises
locally from the $G_m$-action on M.

This condition of “local transverse embeddability” can be used to
extend embeddings away from G-orbits; but of course those orbits must
themselves be embeddable. It is enough to consider just orbit types,
where two orbits have the same type if their isotropy subgroups are
conjugate. The “on orbits” condition is: every orbit (type) of M
should embed equivariantly in $V^t \setminus \{0\}$ for some t. The
origin is deleted to ensure that distinct orbits of the same type can
be embedded disjointly. The orbit types are G-diffeomorphic to coset
spaces $G/\Sigma$ as $\Sigma$ runs through representatives of
conjugacy classes of isotropy subgroups of G on M.

More precisely, following Wassermann [30], define M to be subordinate
to V if for each $m \in M$ there exists a G-invariant neighbourhood U
of m that embeds equivariantly into $V^t \setminus \{0\}$ for some t.
By [22] M is subordinate to V if and only if:

(a) V contains an isomorphic copy of every $G_m$-irreducible occurring
in every $T_m M$;

(b) Every orbit type in M embeds equivariantly into
$V^t \setminus \{0\}$ for some t.

Let us analyse examples 1 and 2 to check these conditions. For example
1, take V to be R with nontrivial Z2-action. Condition (a) is easily
checked; but (b) fails since either 0 or $\pi$ in M must map to 0
under an equivariant embedding. In example 2, on the other hand, with
$V = R^2$ and the standard circle action by rotation, (a) is valid
since every point has trivial isotropy, and (b) is valid since the
only orbit type is a circle with the standard action. Thus
Wassermann's conditions explain our previous findings.

The upshot of these considerations is that, if we are to prove an
equivariant Takens embedding theorem, then the phase space M must be
subordinate to the target V of the equivariant observation h. The
implications of this condition for experiments are discussed in
section 8 below.

In fact this necessary condition is also sufficient. We state the main
theorem:

Theorem 2. (Equivariant Takens embedding theorem). Let G be a compact
Lie group. Let M be an n-dimensional G-manifold, and let $\phi$ be a
G-equivariant diffeomorphism on M. Let V be a Euclidean G-space such
that M is subordinate to V, and let $h: M \to V$ be a G-equivariant
mapping. For any t, define $\Phi_{(\phi,h),t}: M \to V^t$ by

$\Phi_{(\phi,h),t}(x) = (h(x), h(\phi(x)), \ldots, h(\phi^{t-1}(x)))$.

Then generically in $(\phi, h)$, the map $\Phi_{(\phi,h),2n+1}$ is an
equivariant embedding of M in $V^{2n+1}$.

Sketch of proof. Argue as in [29], but starting with arbitrary t. Use
condition (a) above to ensure that the delay map $\Phi_{(\phi,h),t}$
is an immersion near fixed points and points of low period; use (b) to
choose an appropriate cover of the remainder of M so that the
extension argument in [29] can be rendered equivariant. The arguments
of Wassermann [30], proposition 1.2, then let us replace t by
$2n + 1$.

In practice, since n is generally unknown, the precise bound on t may
not be important. However, if estimates of the size of n are
available, say by standard dimension-counting arguments or bifurcation
analyses, the bound may be useful. As explained in [26] in the
non-equivariant case, it is the box-counting dimension of the
attractor under consideration that really controls the size needed for
t. This remark presumably extends to the equivariant case.


7.  Pointwise  and  setwise  symmetry 

If topology is unimportant, there are simpler ways to measure the
symmetries of an attractor A in M. One is to use the observation h to
define a measure on V and find the symmetries of that measure. Divide
V into small boxes, and for each box B in V, count how many times h(m)
lies in B. This defines a measure, and under appropriate genericity
hypotheses on h and V its (approximate) symmetries will correspond to
those of A. This approach is simple, but it has two defects. The first
is that it does not also capture the topology of the attractor, so it
cannot of itself distinguish chaos from regular dynamics. The other is
that it cannot distinguish setwise symmetry of the attractor from
pointwise symmetry, which is an important distinction with relevance
to experiments. It is implicit in the generalities of section 3, but
deserves more detailed explanation, which we now give.
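The box-counting measure described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the target V is taken to be the plane, the symmetry tested is the reflection (x, y) -> (x, -y), and the function names and the total-variation score are hypothetical choices, not part of the original method.

```python
import numpy as np

def box_measure(obs, bins, extent):
    """Count how often observations h(m) fall in each box of a square grid
    covering [-extent, extent]^2, normalised to total mass 1."""
    counts, _, _ = np.histogram2d(
        obs[:, 0], obs[:, 1], bins=bins,
        range=[[-extent, extent], [-extent, extent]],
    )
    return counts / counts.sum()

def reflection_asymmetry(mu):
    """Total-variation distance between the measure and its image under
    y -> -y (assumed group action); 0 means exactly symmetric."""
    return 0.5 * np.abs(mu - mu[:, ::-1]).sum()
```

A small value of `reflection_asymmetry` suggests an (approximate) reflectional symmetry of the measure; as the text notes, it cannot say whether that symmetry is setwise or pointwise.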

Let  us  first  describe  the  distinction  as  it  would 
appear  in  an  experiment.  If  an  attractor  has 
pointwise  symmetry,  then  at  every  instant  of  time 
the  state  of  the  system  has  that  symmetry.  In  the 
case  of  a  chaotic  attractor,  this  would  appear  as  a 
“spatially  ordered,  temporally  chaotic”  state.  If 
the  symmetry  of  the  attractor  is  only  setwise,  the 
state  at  any  given  instant  of  time  will  appear  to 
be  asymmetric.  The  symmetry  will  be  visible 
only  “on  the  average”  as  already  explained.  This 
would  be  a  “spatially  disordered,  temporally 
chaotic”  state. 

Thus the distinction between pointwise and setwise symmetry is central
to the transition from spatially ordered, temporally chaotic states to
spatially disordered ones. This is especially true since transitions
from pointwise to setwise symmetry are common in symmetric dynamical
systems.

We now describe the situation in more abstract terms, to explain why
pointwise symmetric attractors can easily arise, and why they can lose
stability to setwise symmetric ones. Suppose that A is an attractor in
the G-manifold M, having a dense dynamical orbit generated by a point
$a \in M$. In section 3 we have distinguished between the symmetry
group of A and the isotropy group of a, that is, between the setwise
and pointwise symmetries of A.

For example, consider turbulent Taylor vortices, and again restrict
attention to Z2 symmetry. If A is an attractor with symmetry group Z2,
then the corresponding flow will have symmetry Z2 “on average”. The
radial velocities at symmetrically related points - together forming
the equivariant observation - need not be the same at each instant.
They will be more like fig. 1, with statistically indistinguishable
but different time series for the two velocities.

However, if the isotropy group of a is Z2 then by equivariance and
continuity every point on A has isotropy subgroup (containing) Z2 (and
for almost all points it is equal to Z2). The two velocities will be
equal at each instant of time, and the two time series will be
identical.

In both cases, the symmetry group of A is Z2. However, in one case
there exists a point $a \in A$ whose isotropy group is not Z2; in the
other, the isotropy group of a is Z2 for some (and hence all)
$a \in A$. It is important to distinguish these two cases
experimentally. This turns out to be quite easy in an equivariant
reconstruction. In both, A is fixed setwise by Z2. However, in the
first case A is not contained in Fix(Z2), whereas in the second, A is
contained in Fix(Z2).

Fig.  3  is  an  example  of  the  first  case  in  the 
system  (3).  The  setwise  symmetry  is  evident  to 
the  eye;  but  not  all  points  of  the  attractor  lie  on 
the  symmetry  axis.  An  example  of  the  second 
case,  also  occurring  in  (3),  is  shown  in  fig.  4. 
Now  the  entire  attractor  is  concentrated  along 
the  symmetry  axis. 

Fig. 3. An attractor A for (3) with symmetry group Z2 but trivial
isotropy group of a, where the orbit of a is dense in A. Parameter
values are $\alpha = -1$, $\beta = 0$, $\gamma = -0.65$,
$\lambda = 2$.

As noted above, the second type of attractor, with pointwise symmetry,
is responsible for “spatially ordered, temporally chaotic” states. For
example  Caponeri  and  Ciliberto  [6]  report  con¬ 
vection  flows  in  an  annulus  that  at  each  instant  of 
time  possess  spatial  dihedral  group  symmetry 
D,3,  but  are  temporally  chaotic. 

Spatially ordered, temporally chaotic states may seem puzzling, but
they have a very simple explanation in abstract terms. Given a
subgroup $\Sigma$ of G, let $N = \mathrm{Fix}(\Sigma)$, a submanifold
of M. By equivariance, it is easy to show that N is invariant under
the dynamics $\phi$ on M. (For example, if z in (3) is real, say
z = x, then $f(x) = (\alpha x^2 + \beta x^3 + \lambda)x + \gamma x^2$
is also real.) Suppose N is stable to perturbations transverse to N,
that is, to symmetry-breaking perturbations. Then the system will
naturally assume a state that lies entirely within N, that is, has
pointwise symmetry $\Sigma$. The dynamics on N can in principle be
anything; in particular, it may be chaotic. If so, one observes
persistent (stable transverse to N) states with spatial symmetry
$\Sigma$ but temporal chaos. Observe that in this description the
symmetry is effectively “factored out”, and most phenomena observed
will be typical of ordinary dynamics without symmetry - unless they
involve singular points of the quotient map $M \to M/G$.

Fig. 4. Three attractors A for (3) with symmetry group Z2 and isotropy
group Z2 for all $a \in A$. Parameter values are $\alpha = -1$,
$\beta = 0$, $\gamma = -0.6$, and $\lambda = 1.99, 2.1, 2.18$ reading
anticlockwise from lower right. Transients are shown to indicate
transverse stability.

Suppose now a bifurcation occurs that creates an instability
transverse to N - a symmetry-breaking instability. An example for (3)
is shown in fig. 5. It is ordinary chaos for the restricted mapping,
which in this case is $f(x) = (\alpha x^2 + \lambda)x + \gamma x^2$.
Now the dynamics will drift away from N, and the spatial order will
break down. There may or may not remain a ‘hidden’ order “on the
average” - it depends on what happens to the symmetry group of A. The
symmetry group cannot now be factored out since the dynamics moves
away from the fixed-point space N.




Fig. 5. Transition to spatial disorder in one of the attractors in
fig. 4, caused by instability transverse to the fixed-point space for
Z2. Parameter values are $\alpha = -1$, $\beta = 0$, $\gamma = -0.6$,
and $\lambda = 1.99$. Left: unstable invariant set lying in the
fixed-point space. Lower right: nearby symmetric attractor (a torus on
the verge of breakup) and several superimposed transients to show
instability transverse to fixed-point space. Upper right: the
resulting symmetric attractor, with transients removed.
8. Implications for experimental design

The  practical  implication  of  the  equivariant 
Takens  embedding  theorem  is  that  for  generic 
equivariant  dynamics  and  generic  observations,  a 
delay  coordinate  approach  to  equivariant  phase 
space  reconstruction  can  be  used,  provided  that: 

(a) The “time series” consists of equivariant observations h, which
may require several distinct but symmetrically related physical
observations.

(b) Phase space, or at least the part of the phase space being
reconstructed, is subordinate to the target V of h.

Condition  (a)  is  highly  intuitive;  (b)  is  not.  Both 
are  required. 

To perform the reconstruction, take a time series of equivariant
observations $\{y_t\} = h(m_t)$, where $m_t$ is the system’s state in
phase space at time t. Then form a “moving window” of length N, given
by vectors $x_t = (y_t, \ldots, y_{t+N-1})$, where the $y_t$ are
themselves vectors in V. Then, for large enough N, generically the
attractor formed by the $x_t$ in $V^N$ is topologically equivalent to
the original attractor on M, and the dynamics is given by a shift.
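The moving-window construction can be sketched as follows. This is a minimal Python illustration, assuming the equivariant observations are stored as an array with one row per time step and one column per component of V; the function name `moving_window` is hypothetical.

```python
import numpy as np

def moving_window(y, N):
    """Form delay vectors x_t = (y_t, ..., y_{t+N-1}).

    y : array of shape (T, d), one d-dimensional observation per time step
        (a 1-D array is treated as d = 1).
    Returns an array of shape (T - N + 1, N * d), one delay vector per row.
    """
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        y = y[:, None]
    T, d = y.shape
    if N < 1 or N > T:
        raise ValueError("window length N must satisfy 1 <= N <= T")
    return np.stack([y[t:t + N].ravel() for t in range(T - N + 1)])
```

Successive rows overlap in all but one block of d entries, which is the sense in which the reconstructed dynamics acts as a shift.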

It is not always clear how to ensure in advance that conditions (a) or
(b) actually hold. (For finite symmetry groups one - rather drastic -
answer is to ensure that the target involves all irreducible
representations.) To verify (b) in particular cases requires some kind
of information on the phase space that is being sought, and this is
not always available. Analysis of specific models, or theoretical
bifurcation scenarios, can sometimes provide such information - for
example, they may give clues as to the important “modes” of the
system, and the space of modes gives at least local information on the
group action on phase space. Another approach would be to factor out
the group action and reconstruct M/G using the methods of Sauer et al.
[26]. If G is finite, little information about the dynamics will be
lost when passing to the orbit space, since orbits are disconnected;
but if G has a continuous part, rather a lot of information will be
lost.


For practical implementation of such methods, refinements such as
those introduced by Broomhead and King [3,4] must also be developed in
the symmetric case. A few remarks along these lines are made in King
and Stewart [17]. The topic deserves further study, as do equivariant
analogues of alternative reconstruction methods. Clearly many
questions remain to be answered in this area: we leave them open for
future work.
We have developed a topological procedure for analyzing chaotic time series which identifies the stretching and
squeezing mechanisms responsible for chaotic behavior in low-dimensional dynamical systems. These mechanisms,
quantitatively described by a “template” or “knot-holder”, can then be used to model the processes which generate the
original chaotic data set.


1.  Introduction 

A century ago Poincare observed that the key to a deep understanding
of a dynamical system lay in identifying and understanding its
unstable periodic orbits [1]. Up to the present time this observation
has not been exploited in our attempts to understand the chaotic
behavior which is exhibited by low-dimensional dissipative and
conservative dynamical systems.

At present, there are two broad approaches to the understanding of
chaotic behavior in dynamical systems. These are the metric [2-4] and
the topological [5-10] approaches.

The metric approach is based on the study of distances between points
in a strange attractor. In this approach it is usual to compute
Lyapunov exponents [2], various dimensions [3], scaling functions [4],
etc. As a general rule these computations require very large data
sets, are computation intensive, and degrade rapidly with noise. The
real numbers which are computed lack confidence intervals since there
does not yet exist an underlying statistical theory, cannot be
independently verified, and do not describe “how to model the
dynamics” [8].

The topological approach is newer. It is based on the observation that
two mechanisms are responsible for the creation of a strange
attractor: stretching and squeezing [9, 10]. The stretching mechanism,
which causes nearby points in phase space to diverge from each other,
is responsible for “sensitive dependence on initial conditions”. The
squeezing mechanism, which prevents phase space points from escaping
from a compact domain, is responsible for the “recurrence” phenomenon
characteristic of chaotic behavior.

These two mechanisms act to organize the strange attractor in phase
space in a unique way. In addition, they act to organize the unstable
periodic orbits which exist in the neighborhood of the strange
attractor (densely for a hyperbolic invariant set [11]) in a unique
way. This means that if we can determine how the unstable periodic
orbits are organized, we can identify the stretching and squeezing
mechanisms which are responsible for the creation of the strange
attractor. This identification is topological (homological) and given
in terms of a set of integers [9]. The extraction of these integers
from time series data is robust against noise and is independently
verifiable [10]. Once these mechanisms have been identified, a
geometrical model can be constructed which describes how to model the
stretching and squeezing mechanisms responsible for generating the
original time series. This geometrical model can be used to generate
synthetic data sets which have the same topological properties, but
not necessarily the same metric properties, as the original data set.

This topological analysis technique which we describe below is
applicable to “low”-dimensional dynamical systems. By “low” dimension
we mean n-dimensional systems ($n \geq 3$) which possess a chaotic
attractor with Hausdorff dimension < 3. This technique has been
carried out successfully on experimental data sets from the
Belousov-Zhabotinsky reaction (BZ) [10], the laser with saturable
absorber (LSA) [12], and the NMR laser [13]. With one exception
(explained below in section 5) the steps involved in this topological
analysis are illustrated for the BZ data set.


2.  Summary  of  steps 

Our topological analysis procedure consists of a number of successive
steps. Each is relatively simple. These are summarized in fig. 1 and
described briefly below. This description is elaborated upon in the
following six sections [10].

[Fig. 1 flow chart, top to bottom: Close returns; Embedding;
Topological invariants; Template identification; Template
verification; Model the dynamics.]

Fig. 1. Six steps in the topological analysis of chaotic time series
data. The arrows on the right describe the self consistency checks
afforded by this topological procedure for analyzing data.


1. Close returns. This algorithm is used to locate segments in the
chaotic time series which can be used as surrogates for the unstable
periodic orbits which exist in the neighborhood of the strange
attractor.

2. Embedding. An embedding of these orbits into a three-dimensional
space is required in order to identify their topological organization.
If such an embedding cannot be found, or does not exist (when the
Hausdorff dimension $d \geq 3$) the following steps cannot be carried
out and the topological analysis cannot be completed.

3. Relative rotation rates and linking numbers. The topological
organization of all unstable (surrogate) periodic orbits extracted
from the time series is determined by computing the relative rotation
rates and linking numbers of all pairs of periodic orbits and the
self-relative rotation rates and self-linking number of each
individual periodic orbit.

4. Template identification. The template or knot-holder, which
supports all unstable periodic orbits in the strange attractor, is
identified on the basis of the relative rotation rates or linking
numbers of an appropriate subset of orbits.

5. Template verification. Once a template has been tentatively
identified, it can be used to predict the (self-) relative rotation
rates and (self-) linking numbers for all (orbits) orbit pairs
supported by that template. If these predicted (from the template)
topological invariants agree with the measured (computed from the
surrogate orbits) topological invariants, we have added confidence
that the initial template identification was correct. If these
invariants do not agree, we can reject the hypothesis that the initial
template identification was correct.

6. Model the dynamics. A template serves as a geometrical model for
the dynamics (stretching and squeezing mechanisms) which generate the
chaotic time series. It is straightforward to build an ‘equation free’
computer model which drives a flow through a template. Time series
data generated by such an equation free model have topological
properties identical to those of the original time series.

The two arrows in fig. 1 indicate “loop closing” procedures. In any
analysis it is extremely important to be able to determine, by
independent means, if possible, whether the results of the analysis
are in fact correct or not. Such “loop closing” procedures are absent
from the metric approach to the analysis of chaotic time series, while
the topological approach possesses the two indicated.

The template is overdetermined by the topological invariants of
periodic orbits extracted from time series data. A subset is used to
make the template identification. The compatibility of this
identification with the topological properties of the remaining orbits
provides an independent test for the validity of this identification.


3.  Close  returns 

The close returns test is predicated on the existence of unstable
periodic orbits in the neighborhood of a strange attractor or strange
invariant set [6-8, 11, 14]. If a point in the attractor is near an
unstable periodic orbit with relatively low period and low Lyapunov
exponent, it can evolve in the neighborhood of that orbit long enough
to return to an epsilon neighborhood of its starting point. Since
chaotic systems are deterministic, this close return provides an
initial condition for a segment which evolves in the neighborhood of
the segment generated by the initial point.

Such close return segments can be located in the original time series,
without embedding, as follows [10]. If $x(i)$ ($i = 1, 2, \ldots, N$,
where N is the length of the data set) and $x(i+p)$ are coordinates of
two points which are neighbors in some appropriate phase space, then
$x(i+1)$ and $x(i+1+p)$ will also have approximately equal values, as
will $x(i+k)$ and $x(i+k+p)$, for some sequence of values
$k = 1, 2, \ldots$, where p is the period of the nearby unstable
periodic orbit, measured in units of the sampling time. Such close
returns segments can be recognized in the data by making a
two-dimensional close returns plot of

$|x(i) - x(i+p)| \begin{cases} < \epsilon & \text{black}, \\ > \epsilon & \text{white}. \end{cases}$   (1)


Here pixel (i, p) is colored black if the difference
$|x(i) - x(i+p)|$ is below some threshold $\epsilon$, white otherwise.
The threshold $\epsilon$ is fixed, typically at a few percent of the
diameter of the attractor,
$\epsilon \simeq 10^{-2} \times \{\max[x(i)] - \min[x(i)]\}$. Close
returns appear as horizontal line segments whose location in the data
set ($x_i$ to $x_{i+k}$) is clearly indicated. Such segments can
usually be used as surrogates for unstable periodic orbits if they
satisfy additional weak criteria (e.g., they close up when embedded,
cf. section 4). A close returns plot for a portion of the BZ time
series [15] is shown in fig. 2. Close returns plots for LSA time
series data and NMR laser time series data are similar.
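The close returns plot can be sketched directly from the scalar time series. The following Python illustration is a minimal version under stated assumptions: the names `close_returns` and `threshold` are hypothetical, the array is indexed from 0 rather than 1, and the default fraction of the data range is only one reading of "a few percent".

```python
import numpy as np

def threshold(x, fraction=0.01):
    """Threshold as a fraction of the range max[x(i)] - min[x(i)]."""
    x = np.asarray(x, dtype=float)
    return fraction * (x.max() - x.min())

def close_returns(x, eps, max_p):
    """Boolean close returns plot.

    C[p - 1, i] is True (black) when |x(i) - x(i + p)| < eps, for
    p = 1, ..., max_p. Entries with i + p beyond the data stay False.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    C = np.zeros((max_p, n), dtype=bool)
    for p in range(1, max_p + 1):
        C[p - 1, : n - p] = np.abs(x[:-p] - x[p:]) < eps
    return C
```

Long runs of True along a row p then mark candidate segments that shadow a period-p orbit; for display, the array can be passed to any image plotting routine with i on the horizontal axis.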

The close returns plot described above is closely related to
“recurrence plots” introduced by Eckmann, Kamphorst and Ruelle [16].
Recurrence plots were originally introduced as a tool to determine
whether or not a data set was stationary, and have subsequently been
used for that purpose [17]. It was also observed that recurrence plots
provide ‘other important and easily interpretable information’ [16]
but they had not been used to extract periodic orbits from chaotic
data. Recurrence plots are generated from embedded data.

Fig. 2. Close returns plot of the Belousov-Zhabotinsky data. Pixel
(i, p) is colored black if $|x(i) - x(i+p)| < \epsilon$.

The close returns search is robust against additive noise. In fig. 3
we show a sequence of close returns plots on data sets
$BZ(i) + f \times GIID(i, \sigma)$ with increasing amounts of noise.
Here BZ(i) is an experimental data set from the Belousov-Zhabotinsky
reaction [15], GIID is gaussian independent identically distributed
noise with zero mean and standard deviation, $\sigma$, equal to that
of BZ(i), and f ranges from 0.1 to 2.0. The plot degrades gracefully.
Fig. 3f shows how the close returns plot can be recovered from a data
set with equal signal to noise ratio (f = 1.0 shown in fig. 3d) by
using a low pass filter (11 point moving average). Separation of
signal from noise is not too difficult when these two components have
quite different time scales. In the present case the signal has a time
scale of ~130 (samples/cycle) while the noise is independent (time
scale of 1).
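The noise test and the low pass filter can be imitated on any scalar series. The sketch below is a hedged Python illustration: `add_noise` and `moving_average` are hypothetical names, and a synthetic series would stand in for the BZ data, which are not reproduced here.

```python
import numpy as np

def add_noise(x, f, rng):
    """Return x(i) + f * G(i), with G gaussian, independent, zero mean and
    standard deviation equal to that of x."""
    x = np.asarray(x, dtype=float)
    return x + f * rng.normal(0.0, x.std(), size=x.shape)

def moving_average(x, width=11):
    """Simple low pass filter: centred moving average over `width` points.
    The output is shorter than the input by width - 1 samples."""
    kernel = np.ones(width) / width
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")
```

With about 130 samples per cycle, an 11 point average barely attenuates the signal while reducing independent noise by roughly a factor of the square root of 11.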

For large data sets a close returns histogram [14]

$H(p) = \sum_i \Theta(\epsilon - |x(i) - x(i+p)|)$   (2)

can be constructed. Here $\Theta$ is the Heaviside theta function. A
chaotic data set will exhibit a series of peaks (fig. 4a) while a
stochastic data set will generate a uniform distribution (fig. 4b).
The peak centered at p = 0 in fig. 4a describes dynamic correlations,
which must strictly be excluded from Grassberger-Procaccia dimension
computations which attempt to characterize geometric correlations
(i.e., the peaks centered at p > 0). Failure to separate dynamic from
geometric correlations has rendered many previous dimension
computations “obsolete” [18].
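The histogram of eq. (2) amounts to counting the black pixels in each row of the close returns plot. A minimal Python sketch follows, with the hypothetical name `close_returns_histogram` and 0-based indexing of the data as assumptions.

```python
import numpy as np

def close_returns_histogram(x, eps, max_p):
    """H(p) = number of i with |x(i) - x(i + p)| < eps, for p = 1..max_p.

    The returned array holds H(1), ..., H(max_p) in positions 0..max_p-1.
    """
    x = np.asarray(x, dtype=float)
    return np.array([
        np.count_nonzero(eps - np.abs(x[:-p] - x[p:]) > 0.0)
        for p in range(1, max_p + 1)
    ])
```

Peaks of H(p) at p > 0 then point to the periods of the orbits being shadowed; note that the number of terms in the sum shrinks as p grows, so peaks at large p are slightly under-counted.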

Diagonal segments in close returns plots, and non-zero base lines in
close returns histograms, are due to the fact that in stationary data
sets upward trending segments of data are always followed by downward
trending segments (for every “up” there is a “down”), which causes
accidental close returns. Both the plots and the histograms can be
cleaned up by almost any kind of embedding [14].


Fig. 3. The close returns plot degrades gracefully with additive noise. Here (see text) (a) f = 0.10; (b) f = 0.25; (c) f = 0.50; (d)
f = 1.0; (e) f = 2.0. (f) The signal can be recovered by a low pass filter, in this case an 11 point moving average on the data set
with f = 1.0.
4. Embedding

The topological organization of the unstable periodic orbits (and
therefore the strange attractor) is determined through their (self-)
relative rotation rates [5] and (self-) linking numbers. These can be
computed after the orbits have been embedded in $R^3$ or some other
oriented three-manifold. It is therefore necessary to construct a
three-dimensional embedding of the strange attractor and the periodic
orbits it “contains”.

The Takens time delay embedding [19] does not preserve topological (as
opposed to metric or geometric) information. We have observed a
three-dimensional time delay embedding
$x(i) \to y(i) = \{x(i), x(i+k), x(i+2k)\}$ of Belousov-Zhabotinsky
data to undergo self-intersections as the time delay, k, is increased.
Such self-intersections show that relative rotation rates and linking
numbers are not invariant, but rather depend on the time delay.

We prefer instead an embedding which we call a differential phase
space embedding:
$x(i) \to y(i) = \{x(i),\ dx(i)/dt \simeq x(i+1) - x(i-1),\ d^2x(i)/dt^2 \simeq x(i+1) - 2x(i) + x(i-1)\}$.
This particular embedding can be regarded as an affine transformation
of a Takens embedding with minimum delay.
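The differential phase space embedding uses only nearest-neighbour differences, so it can be sketched in a few lines. This Python illustration assumes unit sampling time and leaves the differences unnormalised, as in the formulas above; the function name is hypothetical.

```python
import numpy as np

def differential_embedding(x):
    """Map a scalar series to rows (x(i), x(i+1) - x(i-1),
    x(i+1) - 2 x(i) + x(i-1)) for all interior indices i."""
    x = np.asarray(x, dtype=float)
    y1 = x[1:-1]                        # x(i)
    y2 = x[2:] - x[:-2]                 # central first difference
    y3 = x[2:] - 2.0 * x[1:-1] + x[:-2] # second difference
    return np.column_stack([y1, y2, y3])
```

Each row is a fixed linear combination of (x(i-1), x(i), x(i+1)), which is the sense in which this is a transformation of a delay embedding with minimum delay.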

We prefer differential phase space embeddings for two reasons.
First, the variables $x$, $dx/dt$, $d^2x/dt^2$ of this embedding
are natural variables to use when attempting to model the
dynamics. Second, linking properties of periodic orbits can be
determined by inspection at a transverse crossing (if crossings
are not transverse, this does not provide an embedding). The
slope of the tangent at a crossing (fig. 5a) is given by

$$\text{slope} = \frac{dx'}{dx} = \frac{dx'/dt}{dx/dt} = \frac{x''}{x'}, \qquad x'' = x' \times \text{slope}. \tag{3}$$

Therefore, the acceleration is proportional to the slope at a
crossing point. As a result, at crossings with $x' > 0$ the
segment with the larger (smaller) slope is over (under) the
other, with the reverse situation when $x' < 0$. This means that
all linking numbers can be computed by inspection.
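
The inspection rule can be written as a short function. This is a sketch only: it assumes the orbits are drawn in the (x, x') plane, that "over" means the larger value of x'', and the function name and labels are hypothetical.

```python
def over_segment(x_prime, slope_a, slope_b):
    """Return 'a' or 'b', whichever segment lies over the other at a crossing.

    Uses x'' = x' * slope: the segment with the larger x'' is taken to be on top.
    """
    if x_prime == 0 or slope_a == slope_b:
        raise ValueError("crossing is not transverse; the rule does not apply")
    return "a" if x_prime * slope_a > x_prime * slope_b else "b"
```

For x' > 0 the steeper segment is returned, and for x' < 0 the shallower one, as stated in the text.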

A differential phase space embedding of the Belousov-Zhabotinsky
data proved inadequate (fig. 6a). All crossings occurred in a
single region which could not be resolved despite the fact that
the data set is essentially noise free. The dynamic range of the
digitized data was $10^4$. Taking the first derivative reduced
the dynamic range of $dx/dt$ to $10^2$. A second difference
reduced the dynamic range of $d^2x/dt^2$ to $10^0$, which is the
noise level of the digitized data.

As a result we employed an integral-differential filter [10]

$$x(i) \to y(i) = \Big\{\, y_1(i) = \sum_{j \le i} x(j)\, e^{-(i-j)/\tau},\quad y_2(i) = x(i),\quad y_3(i) = x(i+1) - x(i-1) \,\Big\}. \tag{4}$$

Fig. 5. (a) In a differential phase space embedding the over- and under-crossings at a point of transverse crossing can be
determined from the slopes of the segments at the crossings. (b) A simple electronic circuit which generates on line a three
dimensional differential-integral phase space embedding.

Fig. 6. (a) The degeneracy in the region of crossings cannot be resolved in a differential $(x, x', x'')$ phase space embedding of the
Belousov-Zhabotinsky data. (b) A differential-integral phase space embedding (4), performed with the filter shown in fig. 5b,
yields a nondegenerate embedding with a hole in the center.

This embedding has the same desirable properties as the
differential phase space embedding when $\tau$ is large (several
cycles). The decaying convolution ($e^{-(i-j)/\tau}$) was used
instead of a straight integral
($y_1(i) = \sum_{j \le i} [x(j) - \bar{x}]$) to remove low
frequency drift which was present in the data and caused drift
in the phase space plot of $y_2$ against $y_1$. This embedding
is useful for two practical reasons. First, it can be
implemented on line by a very simple electronic circuit, as
shown in fig. 5b. Second, integrating and differentiating both
reduce the signal to noise ratio by about an order of magnitude,
while differentiating or integrating twice ($k$ times) reduces
S/N by about two ($k$) orders of magnitude. The embedding of the
Belousov-Zhabotinsky data with this filter is shown in fig. 6b.
Since the strange attractor has a hole in the middle, one (or
many) Poincaré sections could be defined. The return map on
several Poincaré sections was used to construct a consistent
symbolic dynamics. The entire data set could be encoded by only
two symbols: 0 (orientation preserving) and 1 (orientation
reversing). The coding of this data set was independent of the
Poincaré section used. This embedding was used on time series
from the LSA and NMR laser, with very similar results.
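
A software version of the integral-differential embedding (4) might look as follows. This is a sketch under stated assumptions: the exponentially weighted sum is started at the first sample, so early rows carry a start-up transient; the third coordinate is taken to be the centered difference x(i+1) - x(i-1), consistent with the differential embedding; and the function name is hypothetical.

```python
import math


def integral_differential_embedding(x, tau):
    """Rows (y1, y2, y3) with y1 a decaying running sum, y2 = x(i), y3 = x(i+1) - x(i-1)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    decay = math.exp(-1.0 / tau)
    y1 = []
    running = 0.0
    for value in x:
        # y1(i) = x(i) + exp(-1/tau) * y1(i-1) equals sum_{j<=i} x(j) exp(-(i-j)/tau)
        running = value + decay * running
        y1.append(running)
    return [(y1[i], x[i], x[i + 1] - x[i - 1]) for i in range(1, len(x) - 1)]
```

The recursion avoids recomputing the full sum at every sample, which mirrors how a simple analog circuit would accumulate the signal with a leak.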

In  general,  constructing  a  consistent  symbolic 
dynamics  for  time  series  data  is  a  problem  of 
fundamental  importance  which  is  not  yet  solved. 
If  the  system  is  highly  dissipative,  as  are  the  BZ 
and  LSA  systems,  the  problem  is  accessible. 
Great  benefit  would  result  from  an  algorithmic 
solution  to  the  problem  of  assigning  a  consistent 
symbolic  dynamics  to  a  chaotic  time  series. 

5.  Relative  rotation  rates  and  linking  numbers 

Once  periodic  orbits  have  been  located  and  a 
three-dimensional  embedding  constructed,  the 
topological  properties  of  these  orbits  can  be 
determined. 

Fig. 7. Embeddings of the period two (a) and period three
(b) orbits in the Belousov-Zhabotinsky data. (c) The linking
numbers of these two periodic orbits can be determined by
counting. The linking number is $(6 \times (+1) + 2 \times (-1))/2 = 2$.

We show how to compute linking numbers in fig. 7. Surrogate
period two and period three orbits are shown in figs. 7a and 7b.
Neither orbit closes, but the period three orbit closes within
the pixel resolution of the plot. These orbits are superposed in
fig. 7c, which indicates the over and under crossings. The flow
direction is shown in fig. 6b. To compute the linking number of
these two orbits we construct tangents to the curves at each
crossing point. The tangent to the overcross, the tangent to the
undercross, and the unit normal to the projection form either a
right handed (+1) or left handed (-1) coordinate system. A sign
(+1, -1) is assigned to each crossing according to handedness.
The linking number of these two orbits is half the sum of signs,
summed over all crossings [6, 20]. Linking numbers can also be
computed by carrying out a Gaussian integral.
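
Both routes to the linking number can be sketched in a few lines. The first function implements the signed-crossing count directly. The second is a simple midpoint discretization of the Gauss linking integral for two closed polygonal curves, offered as an assumption about how such an integral might be evaluated numerically; its sign depends on the orientation conventions chosen, and all names are hypothetical.

```python
import math


def linking_number_from_crossings(signs):
    """Half the sum of the crossing signs (+1 or -1) between two closed orbits."""
    if any(s not in (1, -1) for s in signs):
        raise ValueError("each crossing sign must be +1 or -1")
    total = sum(signs)
    if total % 2:
        raise ValueError("two closed curves must give an even signed sum")
    return total // 2


def gauss_linking_number(curve_a, curve_b):
    """Approximate the Gauss linking integral for two closed polygons in 3-space."""
    def pieces(curve):
        out = []
        for i in range(len(curve)):
            p, q = curve[i], curve[(i + 1) % len(curve)]
            mid = [(a + b) / 2.0 for a, b in zip(p, q)]
            step = [b - a for a, b in zip(p, q)]
            out.append((mid, step))
        return out

    pieces_b = pieces(curve_b)
    total = 0.0
    for mid_a, da in pieces(curve_a):
        for mid_b, db in pieces_b:
            r = [a - b for a, b in zip(mid_a, mid_b)]
            cross = [
                da[1] * db[2] - da[2] * db[1],
                da[2] * db[0] - da[0] * db[2],
                da[0] * db[1] - da[1] * db[0],
            ]
            dist = math.sqrt(sum(c * c for c in r))
            total += sum(rc * cc for rc, cc in zip(r, cross)) / dist ** 3
    return total / (4.0 * math.pi)


def circle(center, u, v, n=200):
    """Closed polygon approximating center + cos(t) u + sin(t) v."""
    return [
        tuple(c + math.cos(2 * math.pi * k / n) * a + math.sin(2 * math.pi * k / n) * b
              for c, a, b in zip(center, u, v))
        for k in range(n)
    ]
```

With the eight crossings of fig. 7c, six positive and two negative, the first function reproduces the value quoted in the caption. The integral is only approximate for sampled orbits, so its result should be rounded to the nearest integer.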

Relative rotation rates were originally developed to
characterize driven dynamical systems [5]. These topological
indices (they are sets of fractions) describe roughly how often
one orbit rotates around another, on average. More specifically,
they are defined as follows. Two orbits A and B, of periods
$p_A$ and $p_B$, intersect a Poincaré section in $p_A$ and $p_B$
points, respectively. A difference vector between one of the
intersections of A and one of the intersections of B with the
Poincaré section is then propagated forward in time. As it
evolves, this difference vector rotates in a plane transverse to
the propagation direction. Eventually it will return to its
initial position (after $p_A p_B$ periods). This requires an
integer rotation through $2\pi$ radians. The relative rotation
rate, for this pair of initial conditions, is this integer
divided by the number of periods, or average rotation per
period. A relative rotation rate can be computed for each of the
$p_A p_B$ initial conditions. These fractions may not be the
same over all initial conditions. The sets of linking numbers
and relative rotation rates for periodic orbits provide a unique
signature for the underlying template for the strange attractor.

Of the three experimental systems studied, only the NMR laser is
periodically driven. The BZ reaction and the LSA are autonomous.
Therefore the computation of relative rotation rates, which were
developed originally for the analysis of driven dynamical
systems [5], proceeds most easily for the NMR laser. We show in
fig. 8 how to carry out the computation of the self-relative
rotation rate for a period 4 orbit in data from the NMR laser
[13], which is a driven dynamical system. Fig. 8a shows a period
4 orbit, repeated a second time. Each tick represents one
driving period. The discontinuity at the repetition of the
period 4 orbit is within the pixel resolution of the plot.
Figs. 8b, 8c, 8d show superpositions in the x-t plane of the
initial orbit with orbits beginning from three other initial
conditions in a Poincaré section. The numbers of crossings are
4, 2, 4 respectively, indicating rotations of the difference
vector through $2(2\pi)$, $1(2\pi)$, $2(2\pi)$ radians and
relative rotation rates of 1/2, 1/4, 1/2, respectively.
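
The arithmetic of this example can be made explicit. The sketch below assumes, as in the x-t superpositions of fig. 8, that each full rotation of the difference vector through 2π shows up as two crossings, and that the rate is the number of rotations divided by the number of driving periods; the function name is hypothetical.

```python
from fractions import Fraction


def relative_rotation_rate(crossings, periods):
    """Rotations per period, taking two crossings in the x-t plane per full rotation."""
    if crossings < 0 or crossings % 2:
        raise ValueError("expected a non-negative even number of crossings")
    if periods <= 0:
        raise ValueError("periods must be positive")
    return Fraction(crossings // 2, periods)
```

For the period 4 orbit, crossing counts of 4, 2 and 4 over four driving periods give 1/2, 1/4 and 1/2, and the orbit compared with itself (no crossings) gives 0.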

The computation is less direct for autonomous dynamical systems
(Belousov-Zhabotinsky reaction, LSA). If one can transform an
autonomous dynamical system to one which is effectively driven,
for example by sweeping a Poincaré section uniformly about an
axis through a hole in the attractor, or by carrying out a
Hilbert transform, then the computation becomes as simple as
that shown above for the NMR laser. Whether or not this is done,
the relative rotation rates can be computed by counting
crossings. We have developed and implemented an algorithm which
does this efficiently [10]. This algorithm was used to determine
tables of relative rotation rates for periodic orbits in the BZ
data [10] and the LSA data [12].

We  show  in  table  1  the  (self-)  relative  rotation 
rates  for  all  orbits,  up  to  period  8,  extracted 
from  the  Belousov-Zhabotinsky  data  [10].  The 
linking  number  of  two  orbits  is  the  sum  of  the 
relative  rotation  rates  over  all  initial  conditions. 
This  is  true  also  of  self-linking  numbers.  Thus,  a 
table  of  linking  numbers  can  easily  be  con¬ 
structed  from  a  table  of  relative  rotation  rates.  A 
similar  table  was  constructed  for  the  NMR  laser 
[13]. Experiments were carried out on the LSA under many
different operating conditions. Different sets of periodic
orbits were found under these different operating conditions,
and for each set a table of relative rotation rates was
constructed [12].
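
The step from a table of relative rotation rates to a linking number is a plain sum over all initial conditions. The sketch below uses a made-up table for a hypothetical pair of orbits, purely to show the bookkeeping; it is not taken from table 1.

```python
from fractions import Fraction


def linking_number_from_rates(rate_table):
    """Sum the relative rotation rates over every pair of initial conditions."""
    total = sum((Fraction(r) for row in rate_table for r in row), Fraction(0))
    if total.denominator != 1:
        raise ValueError("rates do not sum to an integer; check the table")
    return int(total)


# Hypothetical 1 x 2 table: a period one orbit against a period two orbit.
example_table = [[Fraction(1, 2), Fraction(1, 2)]]
example_linking = linking_number_from_rates(example_table)
```

The integrality check is a cheap consistency test: a sum that is not an integer signals a miscounted crossing somewhere in the table.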


Fig. 8. (a) A period 4 orbit from the NMR laser, repeated a second time ($x$ versus $t$). Each tick represents a full period. The
self-relative rotation rates of this orbit can be determined by counting the crossings of this orbit when started from each initial
condition in a Poincaré section. These are: (a) 0; (b) 4; (c) 2; (d) 4, for a spectrum $(1/2)^2$, $1/4$, $0$ of self-relative rotation rates.
Table 1

Relative rotation rates for periodic orbits extracted from the Belousov-Zhabotinsky data, up to period eight. Each orbit is
identified by its symbolic dynamics.
6.  Template  identification

Birman  and  Williams  have  proved  a  remark¬ 
able  theorem  which  greatly  facilitates  the  analy¬ 
sis  of  dynamical  systems  which  exhibit  chaos 
[21-23]. 

This  theorem  states  that  for  a  dissipative 
three-dimensional  dynamical  system  which  ex¬ 
hibits  chaos  and  has  a  hyperbolic  invariant  set,  it 
is  possible  to  project  all  periodic  orbits  onto  an 
unstable  invariant  manifold  in  the  direction  of 
the  stable  foliation  without  incurring  crossings. 


This  means  that  the  topological  organization  of 
the  unstable  periodic  orbits  is  not  changed  by  the 
projection.  This  projection  allows  us  to  replace 
the  flaky  fractal  structure  of  a  strange  attractor 
by  a  branched  two-dimensional  manifold.  The 
theorem  tells  us  that  the  stable  direction  is  not 
important  in  understanding  the  dynamics  of  the 
flow.  Examples  of  branched  manifolds  for  four 
standard  flows  are  shown  in  fig.  9. 

Fig. 9. Templates and their classification by integers for four common nonlinear dynamical systems. (a) Rössler equations; (b)
Lorenz equations with large $r$ (>150); (c) van der Pol equations; (d) Duffing equation before homoclinic tangency. The return
flow is shown for the Rössler template (a); only the stretching and compressing parts of the templates are shown for the
other three flows.

It might appear that the utility of the Birman-Williams theorem,
and branched manifolds, to physical systems is restricted. For
example, the theorem is stated for hyperbolic invariant sets,
which are typically not seen in experimental systems. The point
is that the linking numbers of orbits which exist in this limit
must remain unchanged as long as these orbits exist, when one
makes excursions away from the hyperbolic limit and other orbits
are "pruned" away [10]. This is the situation we have always
encountered in experimental data sets [12, 13].

It also appears that the Birman-Williams theorem is a strictly
three-dimensional theorem. However, it remains true for
n-dimensional dissipative systems provided these systems have
only one unstable direction and are strongly dissipative [24].
In terms of the ordered eigenvalues
$\lambda_1 > \lambda_2 > \lambda_3 > \cdots > \lambda_n$ this
means $\lambda_2 = 0$ (only one unstable direction) and
$\lambda_1 < |\lambda_i|$, $i = 3, 4, \ldots$ (strongly
dissipative). This means that the Birman-Williams theorem can be
applied to systems with a strange attractor whose Hausdorff
dimension $d$ is less than 3 (by the Kaplan-Yorke conjecture
[25], $d = 2 + \lambda_1/|\lambda_3| < 3$). In essence, the
stable directions are not important in determining the dynamics
of a strongly contracting flow with one unstable direction.
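
The dimension estimate quoted from the Kaplan-Yorke conjecture can be evaluated for any spectrum of exponents. The sketch below implements the general Kaplan-Yorke formula, of which the expression in the text is the special case with one positive and one zero exponent; the function name and the numerical exponents in the usage note are illustrative assumptions, not measured values.

```python
def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke (Lyapunov) dimension k + (sum of the k largest) / |next exponent|."""
    ordered = sorted(exponents, reverse=True)
    partial = 0.0
    for k, value in enumerate(ordered):
        if partial + value < 0:
            # The running sum first goes negative at index k.
            return 0.0 if k == 0 else k + partial / abs(value)
        partial += value
    return float(len(ordered))
```

With the illustrative spectrum (0.5, 0, -2) the result is 2 + 0.5/2 = 2.25, below 3 as required. A spectrum such as (1, 0, -0.5, -3), which violates the strong dissipation condition, gives a value above 3.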

The  projection  of  a  strange  attractor  to  a 
branched  manifold  and  the  identification  of 
periodic  orbits  in  the  strange  attractor  with  those 
in  the  branched  manifold  has  the  following  use¬ 
ful  result.  It  is  possible  to  compute  the  topo¬ 
logical  organization  of  the  periodic  orbits  on  the 
template  rather  easily.  Since  a  transverse  section 
of  the  template  provides  a  one-dimensional  re¬ 
turn  map,  only  the  kneading  theory  for  one- 
dimensional  return  maps  is  required  to  locate 
periodic orbits on the template [26]. The linking
properties of these orbits are then determined by
the organization of the framed braids in this
template [21, 22]. This organization is specified
by how the framed braids split apart, wrap
around each other, and are joined together.
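
As a bookkeeping aid, the symbolic names of candidate periodic orbits can be enumerated directly. The sketch below lists one representative word per orbit for a full shift on the given symbols, that is, for the hyperbolic limit in which no orbit has been pruned. It does not implement kneading theory, which is what decides which of these candidates are actually present; the function name is hypothetical.

```python
from itertools import product


def primitive_orbits(period, symbols="01"):
    """One symbol name per periodic orbit of the given minimal period (full shift)."""
    names = []
    for letters in product(symbols, repeat=period):
        word = "".join(letters)
        rotations = [word[i:] + word[:i] for i in range(period)]
        # Keep the smallest rotation, and only if the word is not a repeat of a shorter one.
        if word == min(rotations) and len(set(rotations)) == period:
            names.append(word)
    return names
```

On two symbols this gives 2, 1, 2, 3, 6, 9, 18 and 30 candidate orbits for periods one through eight, including the names 1, 01 and 011.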

This organization is completely determined by a set of integers
organized into an $n \times n$ symmetric integer valued matrix
and a $1 \times n$ integer valued array, where $n$ is the number
of branches in the template [9]. The matrix is constructed from
period one orbits. Within each branch of the template is exactly
one period one orbit. The diagonal matrix element, $T(i, i)$, is
the local torsion of the unique period one orbit in the $i$th
branch, measured in units of $\pi$. The off diagonal elements,
$T(i, j) = T(j, i)$, are twice the linking number of the period
one orbits in the $i$th and $j$th branches of the template.
These integers can also be determined by counting signed
crossings: of the $i$th and $j$th period one orbits
($T(i, j)$), and of the boundaries of the $i$th branch
($T(i, i)$). The array information, $A(i)$, indicates the order
in which the branches are glued together at the junction. The
larger the value of $A(i)$, the nearer to the front is the
$i$th branch.

A  simple  counting  procedure  has  been  en¬ 
coded  to  count  crossings  for  any  orbits  on  any 
template  [27].  There  are  two  types  of  input, 

(1)  a  classification  of  template,  by  integers, 

(2)  a  list  of  orbits,  by  symbolic  dynamics. 

The  two  outputs  are 

(A)  a  table  of  relative  rotation  rates, 

(B)  a  table  of  linking  numbers. 

Thus,  given  a  template,  it  is  possible  to  com¬ 
pute  tables  of  relative  rotation  rates  and  linking 
numbers  for  all  (or  any  subset  of)  periodic  orbits 
supported  by  that  knot-holder.  Conversely, 
given  a  table  of  relative  rotation  rates  or  linking 
numbers,  it  is  possible  to  use  the  subset  of  this 
table  restricted  to  the  period  one  and  period  two 
orbits  to  compute  the  integers  which  classify  the 
template:  the  template  matrix  and  array. 

For the Belousov-Zhabotinsky data, dynamics on two symbols
suffices. This means that the template has only two branches and
can be identified from only three orbits: the two period one
orbits 0 and 1, and the period two orbit 01. The period one
orbit 0 was not present in the data, so the next lowest period
orbit 011 was used to provide the required information. These
three orbits uniquely identify the homology of the template.
This template, and its topological characterization, is shown in
fig. 10 [10]. The flow occurs on a suspension of the Smale
horseshoe with zero global torsion. Since not all of the orbits
allowed in the hyperbolic limit are seen in the data, the flow
is restricted to a subset of this template.

Fig. 10. The template for the Belousov-Zhabotinsky data is
a zero torsion lift of a Smale horseshoe. The LSA and NMR
laser are described by the same template. The flow is
restricted to a subset of the template. The subset has a fractal
structure, since it must exclude all pre-images of certain
regions.

For  the  LSA  data  and  the  NMR  laser  data  two 
symbols  also  suffice.  In  both  cases  the  template 
was  identified  as  a  Smale  horseshoe  with  zero 
global  torsion  by  using  suitable  subsets  of  un¬ 
stable  periodic  orbits. 

7.  Template  verification 

Once  a  template  has  been  identified,  it  can  be 
used  to  predict  the  topological  invariants,  the 
relative  rotation  rates  and  linking  numbers,  of  all 
periodic  orbits  which  are  supported  by  that  knot 
holder.  These  invariants  are  available  indepen¬ 
dently  from  the  periodic  orbits  extracted  from 
the  chaotic  time  series.  Comparison  of  the  pre¬ 
dicted  (from  the  template)  and  measured  (from 
the  time  series)  topological  invariants  provides 
an independent check on the validity of the
initial template identification. If the predicted
and  measured  matrices  of  topological  invariants 
are  identical,  we  have  added  confidence  that  the 
initial  template  identification  was  correct.  If  they 
are  not  identical,  the  initial  template  identifica¬ 
tion  was  not  correct. 
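
The comparison step amounts to checking two tables entry by entry. In the sketch below each table is assumed to be a dictionary keyed by a pair of orbit names, with the set of relative rotation rates stored as a list of fractions; this layout and the function name are hypothetical.

```python
from fractions import Fraction


def mismatched_pairs(predicted, measured):
    """Orbit pairs whose measured rates are missing from, or differ from, the prediction."""
    bad = []
    for pair, rates in measured.items():
        expected = predicted.get(pair)
        if expected is None or sorted(expected) != sorted(rates):
            bad.append(pair)
    return bad


# Made-up entries, for illustration only.
predicted_table = {("1", "01"): [Fraction(1, 2), Fraction(1, 2)]}
measured_table = {("1", "01"): [Fraction(1, 2), Fraction(1, 2)]}
```

An empty result adds confidence in the template identification, while any entry in the list means the identification should be rejected or the orbit extraction rechecked.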

The template underlying the Belousov-Zhabotinsky
chaotic data was determined on the basis of the
relative rotation rates of the three orbits 1, 01,
011, shown in table 1. Using this
template,  the  relative  rotation  rates  of  the  re¬ 
maining  periodic  orbits  extracted  from  this  data 
set,  up  to  period  eight,  were  computed.  There 
were  no  differences  between  the  relative  rotation 
rates  computed  from  the  orbits  extracted  from 
the  data  and  the  corresponding  orbits  predicted 
from  the  template  [10].  The  templates  identified 
for  the  LSA  and  NMR  laser  were  verified  in 
exactly  the  same  way. 

Independent  checks  on  the  results  of  an  analy¬ 
sis  of  data  are  essential  in  order  to  provide 
confidence  that  the  conclusions  drawn  are  cor¬ 
rect.  The  topological  method  for  analyzing  time 
series  data  provides  such  independent  verifica¬ 
tion  procedures,  while  the  metric  approach  does 
not. 

8.  Model  the  dynamics 

A  template  provides  a  qualitative  model  for 
the  flow  which  generates  chaotic  time  series. 

We  have  developed  a  simple  computer  model 
for  a  flow  on  a  horseshoe  template.  This  model 
generates  chaotic  time  series  similar  in  form  to 
the  initial  time  series.  In  fact,  the  two  time  series 
have  identical  topological  properties  but  not 
necessarily  identical  metric  properties. 

Fig. 11. A template provides a geometric model of the flow.
In the spiral template the flow stretches in the angular
direction between the planes $z = 0$ and $z = 1$, and is
compressed to fit within the part of the spiral between 0 and
$\pi$ between the planes $z = 1$ and $z = 2$. The flow is then
reinjected into the $z = 0$ plane from the $z = 2$ plane.

The idea behind this modeling effort is illustrated in fig. 11.
A three dimensional structure is devised whose horizontal cross
section is a "thick spiral". Points in this spiral are measured
by their angular displacement from some fixed line and the
radial distance in the transverse direction. The coordinates of
a point in the spiral in the top ($z = 0$) plane are restricted
to be in the 'branch'. The stretching phase of the evolution
occurs in the range $0 < z < 1$. In this phase, the angular
value of the phase space point increases uniformly like
$\theta(0) \to \theta(z) = \theta(0) + z(\theta_D + \theta(0)(\lambda - 1))$.
The two parameters, $\theta_D$ (drift) and $\lambda$ (Lyapunov
exponent), are 'unfolding' parameters for the stretching
process. The squeezing phase is carried out between the $z = 1$
and $z = 2$ planes by deforming a circle containing the spiral
into a shape which fits into the spiral between the angular
variables $0 \le \theta_0, \theta_1 \le \pi$. The angles
$\theta_0$, $\theta_1$ are "pruning parameters". Reinjection
occurs by identifying the $z = 2$ plane with the $z = 0$ plane.
As the control (unfolding and pruning) parameters are varied,
the flow is restricted to different parts of this structure.

This model seems to describe the properties of many periodically
driven nonlinear oscillators at a qualitative level. For
example, for fixed Lyapunov exponent $\lambda$ (roughly
comparable to the forcing strength of the driving term), as the
drift parameter $\theta_D$ (roughly comparable to the period,
$T$, of the driving term) increases, there is a sequence of
direct and inverse saddle node bifurcations of period one
orbits. Each successive bifurcation creates a saddle and a node
with global torsion $n$; the node becomes a flip saddle and
initiates a period doubling cascade and chaotic behavior. This
behavior then reverses itself (bubble formation): the cascade
reverses itself, then the flip saddle becomes a node again, but
this time with a global torsion increased by 1 (as the
eigenvalue circles the origin in the complex plane). This node
then annihilates with the saddle having global torsion $n + 1$
which has meanwhile been created in the succeeding saddle node
bifurcation. The spiral template accounts very well for the
systematic increase in the global torsion of the period one
orbits alternately created and destroyed in saddle node
bifurcations, which has been observed in many driven nonlinear
oscillators [28, 29].

9.  Model  testing 

Topological  analysis  allows  the  possibility  of 
comparing  a  model  with  data  it  purports  to 
describe,  and  determining  whether  the  model  is 
not  consistent  with  the  data  or  it  is  consistent 
(more  correctly,  is  not  inconsistent)  with  the 
data. 

The  procedure  is  simple.  A  template  is  con¬ 
structed  from  the  experimental  data  as  outlined 
above.  The  model  is  then  used  to  generate  a  data 
set  for  the  ‘experimentally  observed  variable’.  A 
template  is  constructed  from  this  model  gener¬ 
ated  data  using  exactly  the  same  steps  as  for  the 
experimental data. If the two templates are different,
the model is not consistent with the data.
Otherwise,  the  model  is  not  inconsistent  with  the 
data. 

We  note  that  templates  have  handedness. 
Therefore,  if  data  and  model  both  generate  zero 
global  torsion  suspensions  of  the  Smale  horse¬ 
shoe,  but  with  opposite  handedness,  the  model 
can  be  rejected. 

This  procedure  has  been  used  to  compare 
three-,  four-,  and  five-dimensional  models  of  the 
LSA  with  experimental  data  from  the  LSA  [12]. 
Each  of  the  models  analyzed  was  compatible 
with  the  data. 


10.  Conclusions 

We  have  developed  a  topological  procedure 
for  analyzing  chaotic  time  series.  This  procedure 
has  been  applied  to  experimental  data  sets  from 
the Belousov-Zhabotinsky reaction [10], the
laser  with  saturable  absorber  [12],  and  the  NMR 
laser  [13]. 

In  this  procedure  periodic  orbits  are  extracted 
from  the  chaotic  time  series  by  the  method  of 
close  returns.  They,  and  the  strange  attractor, 
are  embedded  in  a  three-dimensional  phase 
space  using  a  differential  phase  space  embed¬ 
ding.  The  linking  numbers  and  relative  rotation 
rates  of  these  periodic  orbits  are  determined.  A 
subset  is  used  in  order  to  identify  the  underlying 
template  or  knot  holder  which  supports  the 
strange  attractor  and  all  the  periodic  orbits  in  its 
neighborhood.  This  template  is  then  used  to 
compute  the  topological  invariants  for  all  orbits 
and  orbit  pairs  which  exist  in  the  knot  holder.  A 
comparison  of  these  topological  invariants  mea¬ 
sured  from  the  data  (reconstructed  periodic  or¬ 
bits)  and  those  predicted  from  the  template  pro¬ 
vides  added  confidence  that  the  initial  template 
identification  was  correct,  or  else  that  it  was  not 
correct. The template itself provides a model for
the dynamics, that is, for the stretching and squeezing
mechanisms responsible for generating the chaotic
time series data.


This procedure is useful for data generated by low-dimensional
dynamical systems. By "low dimensional" we mean n-dimensional
dynamical systems (n arbitrary, $n \ge 3$) which are strongly
contracting with only one unstable direction. These have strange
attractors with Hausdorff dimension less than three. This is not
unreasonable, since the whole procedure depends on constructing
a branched two-dimensional manifold in $R^3$ from a set of
topological invariants which can only be defined in $R^3$. It is
remarkable that this procedure is not limited to
three-dimensional dynamical systems. For example, the
Belousov-Zhabotinsky reaction has been modeled
as  a  dynamical  system  ranging  in  dimension  from 
30  down  to  five  [30].  There  are  no  three-dimen¬ 
sional  models  for  this  dynamics.  Nevertheless, 
we  have  constructed  a  good  three-dimensional 
embedding,  which  suggests  that  this  embedding 
could  be  used  to  develop  a  three-dimensional 
model  of  this  system.  As  another  example,  this 
analysis  has  been  used  to  test  three-,  four-,  and 
five-dimensional  models  of  the  LSA  against  data 
generated  by  the  LSA  [12]. 

The  three  experimental  systems  analyzed  so 
far  all  lead  to  the  same  template,  a  zero  global 
torsion  suspension  of  the  Smale  horseshoe.  That 
the  horseshoe  appears  so  often  in  physical  sys¬ 
tems  is  not  surprising,  since  it  describes  the 
chaotic  dynamics  generated  by  a  simple  homo¬ 
clinic  tangency.  For  such  simple  dynamics  much 
information  is  already  conveyed  by  a  return 
map.  So  what  do  we  learn  from  a  topological 
analysis  which  generates  a  template  that  we  do 
not  already  know  from  a  return  map?  The  topo¬ 
logical  analysis  can  distinguish  handedness  (cf. 
section  9)  as  well  as  global  torsion,  neither  of 
which  can  be  determined  from  a  return  map. 
Both  can  be  used  to  compare  models  with  data 
(section  9).  The  creation  of  bifurcation  bubbles 
[28, 29]  in  driven  nonlinear  oscillators  which  is 
caused  by  the  systematic  change  in  global  torsion 
is evident in a template (cf. section 8), but less so
in a return map. More complicated templates, as
occur for the van der Pol and the Duffing oscillators,
convey much more information than a
return  map  and  facilitate  analysis  of  what  hap¬ 
pens  when  the  symmetry  of  the  Duffing  oscillator 
is  broken.  Further,  a  template  provides  a  model 
for  the  flow  dynamics  in  a  way  that  a  return  map 
cannot. 

This  topological  procedure  for  analyzing  cha¬ 
otic  data  can  be  applied  to  relatively  small  data 
sets,  degrades  gracefully  with  noise,  is  falsifiable, 
provides  a  model  for  the  dynamics,  and  can  be 
used  to  compare  models  with  experimental  data. 
We  describe  work  in  progress  on  using  time  series  data  output  from  dynamical  systems  to  determine  information  about
phase  manifolds.  Purely  from  estimates  of  the  probability  density  of  observations,  it  turns  out  to  be  possible  in  principle  to 
determine  the  dimension  and  genus  of  the  manifold.  We  also  show  experimental  evidence  that  our  methods  may  be  useful 
for  fractal  attractors  which  have  nearly  integer  dimension  and  are  well-approximated  by  smooth  objects  such  as  smooth 
manifolds  with  boundary  or  branched  manifolds.  Our  methods  do  not  use  embedding  and  do  not  require  knowledge  of 
dimensions  or  choice  of  time  delays  or  projections. 


1.  Introduction 

Often,  data  from  a  dynamical  process  is  only 
available as a single scalar measurement even
though the phase space of the system is high
dimensional  and  may  be  non-Euclidean.  The 
standard  way  to  handle  this  problem  is  to  use  the 
embedding  theorem  [1-4],  and  this  very  power¬ 
ful  result  has  revolutionised  nonlinear  data  anal¬ 
ysis,  allowing  the  calculation  of  statistics  such  as 
fractal  dimension  [5-7]  as  well  as  the  building  of 
dynamical  models  [8-12]. 

A problem that arises whenever embedding is used is how to
construct a good embedding. It would be useful to know in
advance the dimension of the phase space² for the dynamics;
other information, such as manifold type, would be useful in
construction of qualitative (say, geometrical) models of the
action of the dynamics which may sometimes be more informative
than black-box quantitative models.

¹ On leave from The University of Western Australia.

² We use the terms "phase space", "phase manifold",
"state space" and "state manifold" interchangeably.

In  this  paper  we  use  an  approach  which  is 
independent  of  embedding  and  does  not  require 
choice  of  embedding  dimensions,  lags,  projec¬ 
tions  and  so  on.  It  gives  information  about  the 
manifold  type  if  the  dimension  is  known,  and  in 
principle  can  give  information  about  the  dimen¬ 
sion  too.  The  paper  is  to  an  extent  a  report  on 
work  in  progress,  since  the  proofs  are  fairly 
complex  as  well  as  new  and  it  is  not  yet  clear 
whether  one  can  improve  the  practical  usefulness 
of  the  approach;  consequently,  we  concentrate 
here  on  noise-free  data  available  in  reasonable 
quantities.  We  do,  however,  show  some  evidence 
from  computer  experiments  that  the  method 
works  well  in  the  case  for  which  the  theorem  was 
originally  designed,  namely  compact  smooth 
manifolds,  and  may  be  extensible  to  other  cases, 
including  certain  fractal  objects. 

The  long-term  aim  is  to  try  to  find  dynamical 
signatures,  which  allow  a  (crude)  classification  of 
systems in a simple way. The approach is complementary to embedding approaches insofar as
it  provides  separate  evidence  that  the  phase 
space  has  been  correctly  identified,  and  in  par¬ 
ticular  it  is  complementary  to  other  attempts  to 
identify  manifolds  using  embedding  plus  triangu¬ 
lation  [12]. 

In  the  rest  of  the  paper  we  describe  the  as¬ 
sumptions  we  make  about  the  dynamical  system 
and  the  time  series  of  measurements,  and  state 
the  basic  theorem  in  the  case  of  compact  orient- 
able  two-manifolds.  Then  we  discuss  some  of  the 
background  and  give  a  brief  plausibility  argu¬ 
ment  for  the  theorem.  Following  this  are  some 
examples  of  the  theorem  in  use.  Finally,  we 
show  some  experiments  on  attractors  which  are 
approximately  manifolds,  although  with  features 
that  go  beyond  those  in  the  theorem. 

2.  Manifolds  from  data 

In  this  section  we  outline  recent  work  by 
Noakes  [13]  which  makes  it  possible  in  some 
circumstances  to  identify  manifold  type  and  di¬ 
mension  from  data.  The  approach  is  to  regard  a 
measurement  from  a  dynamical  system  as  a  func¬ 
tion  of  the  state,  and  to  assume  the  function  is 
well-behaved.  If  we  have  a  trajectory,  or  a  set  of 
trajectories,  such  that  the  collection  of  all  the 
states  covers  the  phase  manifold  uniformly,  then, 
as  is  known  from  embedding  theory,  well- 
behaved  functions  of  the  states  preserve  certain 
information.  Surprisingly,  some  of  the  informa¬ 
tion  -  in  particular,  the  manifold  type  and  di¬ 
mension  -  survives  even  drastic  surgery  such  as 
replacement  of  the  time  series  by  a  density  es¬ 
timator. 

A time series on a smooth manifold $M$ (without boundary) is a sequence
$S = \{x_n : n \ge 1,\ x_n \in M\}$. Let $f: M \to \mathbb{R}$ be a
smooth map and write $T$ for the time series
$\{y_n = f(x_n) : n \ge 1\}$ on $\mathbb{R}$. Suppose we have a
situation where $M$ (and therefore $f$) and also $S$ is unknown. Can we
say anything about the geometry of $M$ by observing $T$?


Of course, when $f$ is constant we cannot. But most maps are not like
that: in fact, an open dense set of smooth maps $f$ is restricted Morse
in the sense explained below. For the purpose of this paper we are
going to consider only orientable two-manifolds; this makes the
explanations easier (we can imagine $M$ as a surface) and, more
importantly, ensures that certain features are easy to identify with
data sets of reasonable size.

Definitions.

(1) Call $x_0 \in M$ regular when there is a system $(u, v)$ of local
coordinates near $x_0$ in which $f$ takes the form $f(u, v) = u$. When
$x_0$ is not regular call it a critical point of $f$.

(2) Call $x_0 \in M$ a nondegenerate local minimum when there is a
system $(u, v)$ of local coordinates near $x_0$ in which $f$ takes the
form

$$f(u, v) = u^2 + v^2 .$$

Call $x_0 \in M$ a nondegenerate local maximum when there is a system
$(u, v)$ of local coordinates near $x_0$ in which $f$ takes the form

$$f(u, v) = -u^2 - v^2 .$$

(3) Say that $x_0$ is a nondegenerate saddle when $(u, v)$ can be
chosen so

$$f(u, v) = u^2 - v^2 .$$

We call $f$ restricted Morse when it is one-to-one on its set $C$ of
critical points, and these critical points are either nondegenerate
local maxima or nondegenerate local minima or nondegenerate saddles.

Let $f$ be restricted Morse. Then $f$ has only finitely many critical
points, and the integer

$$\chi(M) = \#(\text{local minima}) + \#(\text{local maxima}) - \#(\text{saddles}), \qquad (1)$$

called the Euler characteristic, is independent of $f$ and depends only
on the geometry of $M$. If $M$ is orientable (which we assume),
$\chi(M)$ determines $M$ up to smooth homeomorphisms. For example, in
the case of any manifold $M$ diffeomorphic to a sphere it turns out
that $\chi(M) = 2$, while for a torus, $\chi(S^1 \times S^1) = 0$. In
fact, for a two-manifold, $1 - \chi(M)/2$ is the number of handles.
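The counting rule in eq. (1) and the handle formula can be written out as a short sketch. The function names below are illustrative only and do not come from the paper; the handle count assumes a compact orientable two-manifold, as in the text.

```python
def euler_characteristic(minima, maxima, saddles):
    """Eq. (1): chi(M) = #(local minima) + #(local maxima) - #(saddles)."""
    return minima + maxima - saddles


def handles(chi):
    """Number of handles, 1 - chi/2, for a compact orientable two-manifold."""
    if chi % 2 != 0 or chi > 2:
        raise ValueError("a compact orientable surface has even chi <= 2")
    return 1 - chi // 2


# Critical-point counts of a generic height function (illustrative values).
sphere_chi = euler_characteristic(minima=1, maxima=1, saddles=0)
torus_chi = euler_characteristic(minima=1, maxima=1, saddles=2)
print(sphere_chi, handles(sphere_chi))
print(torus_chi, handles(torus_chi))
```

In the setting of this paper the three counts would be read off a density plot (cliffs for extrema, splinters for saddles) rather than supplied by hand.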

If $h: N \to M$ is a smooth homeomorphism, then $T$ might also come
from $\tilde S = \{\tilde x_n = h^{-1}x_n : n \ge 1\}$ by means of the
map $\tilde f = f \circ h$. So we cannot expect to be able to determine
much more about $M$ from the time series than the information given by
$\chi(M)$.

This  problem,  and  others,  are  discussed  in  ref. 
[12]  in  greater  detail  in  the  context  of  manifold 
triangulation.  The  present  discussion  is  less  am¬ 
bitious  but  may  be  useful  in  cases  where  triangu¬ 
lation  methods  fail  for  any  of  a  number  of 
reasons;  it  is  also  interesting  as  a  method  in  its 
own  right,  since  the  approach  appears  to  be 
novel,  and  may  be  extensible.  One  limitation  in 
the present discussion is that $S$ should be more
or  less  uniform  with  respect  to  Lebesgue  mea¬ 
sure;  we  accept  that  measures  supported  by  frac¬ 
tal  sets  are  needed  in  some  applications,  and  we 
discuss  this  later.  A  less  essential  limitation  is 
that  we  restrict  ourselves  here  mainly  to  two 
dimensional  manifolds. 

Although  the  time  series  T  does  not  give  much 
information  about  M,  we  are  actually  going  to 
throw  some  away.  We  forget  about  the  ordering 
of  T,  so  that  it  becomes  an  unordered  set  of  real 
numbers.  On  the  face  of  it,  there  is  not  a  lot  we 
can  do  now  except  perhaps  construct  a  histogram 
or  other  density  estimator  for  T.  For  this  paper, 
we  are  only  going  to  look  at  histograms  and  a 
closely  related  estimator. 

Suppose that the $x_n$ are realisations of a random variable $X$
described by a never-zero smooth density $\rho$ on $M$ with respect to
a never-zero smooth area form $\mu$. Then $y_n = f(x_n)$ is a
realisation of the random variable $Y = f(X)$. It turns out that $Y$
has a probability density function $g$, for which we can find an
estimate $\hat g$ from the data $\{y_n\}$. The following result on the
existence of $g$ is nontrivial and is proved in ref. [13] using
techniques from differential geometry.


Theorem 1. Let $f: M \to \mathbb{R}$ be restricted Morse. There is a
smooth density $g$ for $Y = f(X)$ with the following properties:

(1) $g$ is defined over $f(M) - f(C)$;

(2) if $z_0$ is the image of a point of local minimum of $f$ then $z_0$
is a left cliff of $g$, namely

$$\lim_{z \to z_0^+} g(z) > \lim_{z \to z_0^-} g(z),$$

and if $z_0$ is the image of a point of local maximum of $f$ then $z_0$
is a right cliff of $g$, namely

$$\lim_{z \to z_0^+} g(z) < \lim_{z \to z_0^-} g(z);$$

(3) if $z_0$ is the image of a saddle of $f$ then $z_0$ is a splinter
of $g$, namely

$$g(z) = -O(\ln|z - z_0|)$$

for $z$ near $z_0$.

An idea of why the result holds can be got from thinking about how
densities project. Near a point $x_{\min}$ of local minimum, a
restricted Morse function $f$ takes the form
$f(u, v) = f(x_{\min}) + u^2 + v^2$, where $u$, $v$ are local
coordinates in a neighbourhood of $x_{\min}$. Take $f(x_{\min}) = 0$
for simplicity and write $r^2 = u^2 + v^2$. When $Y > 0$ the
infinitesimal contribution to the probability of $Y = f(X)$ from this
neighbourhood is more or less $2\pi r\,dr = \pi\,d(r^2) = \pi\,dY$ and
so the contribution to the density is $\pi$. On the other hand, when
$Y < 0$ the neighbourhood contributes nothing to the probability of
$Y$. So we have $\pi$ on the right and 0 on the left, added to which
there may be a smooth contribution from regions outside the
neighbourhood. The result is a cliff. The argument generalises, of
course: for example, a three-manifold gets a contribution of
$4\pi r^2\,dr = 2\pi r\,d(r^2) = 2\pi Y^{1/2}\,dY$, and the cliff is
replaced by the positive part of a square root function.
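The plausibility argument can be checked numerically with a small sketch. The setup is an assumption made for illustration: points are drawn uniformly on the square $[-1, 1]^2$ (point density 1/4 per unit area) and observed through $Y = u^2 + v^2$, so for $0 \le Y < 1$ the density of $Y$ should be flat at about $\pi \times 1/4$, and zero for $Y < 0$. The function name and sample size are hypothetical.

```python
import math
import random


def density_near_minimum(n_points=200_000, bins=10, seed=1):
    """Histogram density of Y = u^2 + v^2 for (u, v) uniform on [-1, 1]^2.

    Only 0 <= Y < 1 is binned, where the level sets are complete circles.
    """
    rng = random.Random(seed)
    counts = [0] * bins
    for _ in range(n_points):
        u = rng.uniform(-1.0, 1.0)
        v = rng.uniform(-1.0, 1.0)
        y = u * u + v * v
        if y < 1.0:
            counts[min(int(y * bins), bins - 1)] += 1
    width = 1.0 / bins
    # Normalise by the total sample size so the values estimate the density of Y.
    return [c / (n_points * width) for c in counts]


density = density_near_minimum()
expected = math.pi / 4.0  # pi times the point density 1/4 on the square
print([round(d, 3) for d in density], round(expected, 3))
```

The estimated density jumps from zero (no values below $Y = 0$) to a roughly constant level immediately to the right of the minimum value, which is the cliff described above.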

So if we plot $g$, the features characteristic of local maxima and
minima and of saddles may stand out. By counting their numbers we
should be able to estimate $\chi(M)$. Code for finding a standard
equal-width bin histogram is readily available, or is easy to write.
The time required for the calculation is, of course, very short. Fig. 1
shows an example for a torus, where the upper part of the picture is a
two dimensional projection of a torus embedded in three dimensions and
the lower part is the one dimensional projection onto the axis shown.
We see cliffs corresponding to "outside curves" from the viewpoint
implied by the upper picture, and splinters corresponding to "inside
curves". The numbers of cliffs and splinters are the same, so the
formula gives $\chi = 0$ as we would expect. Note that one way in which
genericity is important is that we do not want a cliff and a splinter
(for example) to coincide in the one dimensional projection.

Fig. 1. A possible density for a two-torus embedded in $\mathbb{R}^3$.
Cliffs are marked "c" and splinters are marked "s". There are as many
cliffs as splinters, implying that $\chi(S^1 \times S^1) = 0$.

There are analogous, but less informative, procedures for higher
dimensional manifolds. The reason they are less informative is that the
signatures of extreme points and saddles become less pronounced: for
example, we indicated above that for a three-manifold a cliff is
replaced by a square root function.


3. Applications

We first describe two examples of direct use of the theorem; for more
details and other examples, see ref. [13]. Then we consider how to
better estimate the features like cliffs and splinters that we are
looking for, and apply a modified estimator to a quasiperiodic signal
to reveal the underlying torus.

3.1. Example

We distribute 100000 points $(x_1, x_2, x_3, x_4)$ uniformly over the
torus $S^1 \times S^1$ in $\mathbb{R}^4$ by setting

$$(x_1, x_2, x_3, x_4) = (\cos\theta, \sin\theta, \cos\phi, \sin\phi)$$

for $\theta$ and $\phi$ distributed independently on $(0, 2\pi)$. We
define the observation function $f$ by

$$f(x_1, x_2, x_3, x_4) = (x_1 - 1.8)^2 + (x_3 - 3.7)^2 .$$

The critical points of $f$ can be calculated directly and occur at

(1, 0, 1, 0)      (minimum)   value 7.93,
(-1, 0, -1, 0)    (maximum)   value 29.93,
(1, 0, -1, 0)     (saddle)    value 22.73,
(-1, 0, 1, 0)     (saddle)    value 15.13.


A histogram of the values of $y = f(x_1, x_2, x_3, x_4)$ is shown in
fig. 2. There are two cliffs and two splinters, where the Theorem says
they should be. This is also in agreement with eq. (1) since
$\chi(S^1 \times S^1) = 0$. The features are clear with a large sample
(100 000) and are still visible with a moderate sample (5000).
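The experiment in this example can be reproduced along the following lines. This is a minimal sketch, not the authors' code: the random seed, the helper names and the choice of histogram range (the interval between the minimum and maximum critical values) are assumptions made for illustration.

```python
import math
import random


def observe(theta, phi):
    """Observation f = (x1 - 1.8)^2 + (x3 - 3.7)^2 with x1 = cos(theta), x3 = cos(phi)."""
    return (math.cos(theta) - 1.8) ** 2 + (math.cos(phi) - 3.7) ** 2


def sample_observations(n_points, seed=0):
    """Observed values for points drawn uniformly on the torus S^1 x S^1."""
    rng = random.Random(seed)
    two_pi = 2.0 * math.pi
    return [observe(rng.uniform(0.0, two_pi), rng.uniform(0.0, two_pi))
            for _ in range(n_points)]


def histogram_density(values, bins, lo, hi):
    """Equal-width histogram normalised to integrate to one over [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        k = int((v - lo) / width)
        counts[max(0, min(k, bins - 1))] += 1
    return [c / (len(values) * width) for c in counts]


ys = sample_observations(100_000)
density = histogram_density(ys, bins=100, lo=7.93, hi=29.93)

# Print a coarse text plot; cliffs are expected at the two ends of the range
# and splinters near the saddle values 15.13 and 22.73.
for k in range(0, 100, 4):
    print(f"{7.93 + 0.22 * k:6.2f} {'#' * int(400 * density[k])}")
```

Changing `n_points` to 5000 gives the moderate-sample case mentioned in the text.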
Fig. 2. Density plots (100 bins) for samples from real-valued
observations of a uniform distribution on a two-torus embedded in
$\mathbb{R}^4$. Solid line: 100000 samples. Dotted line: 5000 samples.
There are two cliffs and two splinters, evident in the 100000 sample
case and visible even in the 5000 sample case.

3.2.  Example 

We take 100000 points distributed uniformly over $S^2$ embedded in
$\mathbb{R}^3$ with $f(x_1, x_2, x_3) = (x_1 + 3)(x_2 + 1)(x_3 + 2)$.
We would have guessed $S^2$ from formula (1) and the histogram in
fig. 3, which has two cliffs and no splinters, implying
$\chi(S^2) = 2$.

3.3.  Other  methods  of  density  estimation 

A more classically-based approach to estimating the phase manifold $M$,
used in ref. [12], is to use the ordering of $T$ to define an embedding
of $M$ in Euclidean space, and then triangulate the resulting cloud of
points. This might lead to enhanced estimates, in that smaller samples
might be required to estimate $\chi(M)$ accurately, but it may also
require a lot of technique to implement successfully, particularly if
the data is corrupted by noise or is inaccurate for other reasons.
Although we have not discussed the effects of noise on the approach
being described here, it is reasonable to expect that small noise on
the data should not greatly affect the appearance of the histogram: the
main problem will be the blurring of important features, with the risk
that, say, a quadratic maximum in the density function becomes
indistinguishable from a splinter.

Fig. 3. Density plot (100 bins) for 100 000 samples from a uniform
distribution on a two-sphere embedded in $\mathbb{R}^3$. The
observation function is $(x + 3)(y + 1)(z + 2)$. There are two cliffs
and no splinters.

Although  we  have  used  large  numbers  of 
points  in  the  examples  so  far,  cleverer  density 
estimation techniques could perhaps be used to
reduce  the  size  of  time  series  required,  at  least  in 
the  low-noise  case.  We  have  not,  however,  had 
much  success  using  standard  smoothing  of  histo¬ 
grams;  nor  would  we  expect  to  have  much  suc¬ 
cess  using  smooth  kernel  approximation,  since 
both  of  these  techniques  are  intended  for  es¬ 
timating  smooth  densities  and  obscure  precisely 
the  features  we  wish  to  find. 

A technique which may be better adapted to the present problem is to
try to fit the density more carefully in regions where there are many
data points. The following simple method first estimates the
distribution function (i.e., the integral $P(y)$ of the density). To do
so it sorts the data $\{y_i\}$ into increasing order, giving a series
$\{s_i\}$, which gives points $(s_i, i/n)$. A crude piecewise-linear
(discontinuous) estimator of $P$ is obtained by fitting straight line
segments to subsets of points $\{(s_i, i/n) : i_1 \le i \le i_2\}$ for
various values of $i_1$ and $i_2$. The slopes of the lines are then
estimates of the density. We have found this useful with subsets
ranging in size from 50 to 1000, depending on the number of data points
available.
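One way to realise the estimator just described is sketched below. The text does not say how the subsets are chosen or how the lines are fitted, so this sketch assumes consecutive non-overlapping subsets of equal size and an ordinary least-squares line through each subset; the function name is hypothetical.

```python
def slope_density(values, subset_size):
    """Density estimate from slopes of lines fitted to the points (s_i, i/n).

    Returns a list of (mean position, slope) pairs, one per subset.
    Assumes consecutive, non-overlapping subsets and least-squares fits.
    """
    s = sorted(values)
    n = len(s)
    estimates = []
    for start in range(0, n - subset_size + 1, subset_size):
        xs = s[start:start + subset_size]
        ps = [(start + k + 1) / n for k in range(subset_size)]  # i/n with i from 1
        mean_x = sum(xs) / subset_size
        mean_p = sum(ps) / subset_size
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxp = sum((x - mean_x) * (p - mean_p) for x, p in zip(xs, ps))
        if sxx > 0.0:  # skip subsets whose values are all identical
            estimates.append((mean_x, sxp / sxx))
    return estimates


# Evenly spaced values on [0, 1) have density one everywhere.
grid = [k / 1000 for k in range(1000)]
for position, slope in slope_density(grid, subset_size=50)[:3]:
    print(round(position, 4), round(slope, 6))
```

Because each subset holds the same number of points, the subsets are narrow where the data are dense, which is why the method behaves like variable-width binning.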

3.4.  Example 

A sample of 40 000 points (sampled with time interval 0.4) from the
function

$$y(t) = \cos t + \sin \omega t \qquad (2)$$

(where $\omega = \exp(1)$) was used as input to this modified density
estimator with 100 bins to give the solid line in fig. 4. We do not
need such a huge sample, however: the features are still apparent from
the dotted line estimate which used only 1000 points and subsets of
size 40 (i.e., 25 bins).

Fig. 4. Modified density estimating algorithm applied to 40 000 data
values from the quasiperiodic waveform defined in eq. (2). The phase
space is very clearly identified as a torus by the two cliffs and two
splinters shown on the solid line (40 000 data values) and the features
are also visible with only 1000 data values (dotted line).
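The input data for this example can be generated as follows. This is a sketch under the assumption that sampling starts at $t = 0$ (the text gives only the interval 0.4 and the sample size); the function names are illustrative. It produces the sorted points $(s_i, i/n)$ from which the slope-based estimator of section 3.3 works.

```python
import math


def quasiperiodic_series(n_points=40_000, dt=0.4):
    """Samples of y(t) = cos t + sin(omega t), omega = exp(1), at t = 0, dt, 2 dt, ..."""
    omega = math.exp(1.0)
    return [math.cos(k * dt) + math.sin(omega * k * dt) for k in range(n_points)]


def empirical_distribution(series):
    """Points (s_i, i/n): sorted values paired with the empirical distribution."""
    s = sorted(series)
    n = len(s)
    return [(value, (i + 1) / n) for i, value in enumerate(s)]


points = empirical_distribution(quasiperiodic_series())
print(points[0], points[len(points) // 2], points[-1])
```

Sorting discards the time ordering of the series, which is exactly the information the method chooses not to use.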

The  technique  we  have  used  here  can  be  re¬ 
garded  as  a  variable-width  binning  method,  but 
it  is  sensitive  to  noise  and,  when  there  is  com¬ 
paratively  little  data,  to  subset  size;  it  should 
probably  be  used  in  conjunction  with  standard 
histograms.  It  is  certainly  worthwhile  trying  dif¬ 
ferent  numbers  of  bins:  for  example,  with  the 
torus  the  splinters  and  cliffs  stand  out  with  bins 
containing  very  small  numbers  of  points  (down 
to  2  or  so)  but  there  are  many  spurious  smaller 
peaks,  while  with  too  many  points  per  bin  the 
splinters  are  broadened  to  the  extent  that  they 
become  insignificant. 


4.  Application  to  non-manifolds 

Encouraged  by  our  success  with  two- 
manifolds,  we  now  investigate  empirically  how 
much  further  this  kind  of  analysis  can  be  pushed. 
Our  main  interest  is  in  dynamical  systems  where 
M  is  replaced  by  a  fractal  set  which  is  “thin”  in 
that  its  dimension  is  approximately  an  integer, 
and  that  it  is  well-approximated  by  a  manifold, 
possibly with branches or boundaries, and possibly
non-orientable. (The analysis given earlier
can  be  extended  to  such  more  general  manifolds, 
but  our  interest  here  is  in  the  fractal  objects.) 

4.1.  Example 

We take 9000 points from the Rössler attractor [14] and estimate the
density. The resulting graph in fig. 5 has many of the features of the
previous examples, especially the cliffs at either end which are
characteristic of dimension (approximately) 2. Note also the sharp
points which resemble the splinters observed in fig. 1. The
approximating surface is made by gluing a Möbius band, which is a
non-orientable manifold with boundary, to an annulus, which is an
orientable manifold with boundary.

L. Noakes, A. Mees / Dynamical signatures

Fig. 5. Modified density estimate (100 bins) for 9000 samples from the
Rössler attractor. There appear to be cliffs and perhaps splinters.

4.2.  Example 

When  we  do  the  same  thing  with  9000  points 
from  the  Lorenz  attractor  [15]  we  obtain  fig.  6. 
Similar  comments  apply,  except  that  the  cliffs 
are  less  pronounced:  we  need  a  lot  more  data  to 
see  what  is  really  happening  at  the  edges,  and  in 
fact  we  expect  to  see  fractal  structure,  rather 
than  actual  cliffs.  Fractal  structure  at  the  edges  is 
one  of  the  features  which  will  show  up  in  a  better 
density  estimator:  there  are  already  indications 
in  fig.  6,  and  a  better  estimator  might  also  show 
such structure in fig. 5. We are currently
investigating whether we can estimate dimension
from such fractal structure.

Fig. 6. Modified density estimate (100 bins) for 9000 samples from the
Lorenz attractor.

4.3. Example

To convince ourselves that we are not doing something trivial, we do
the same thing with times between Geiger counter detections of
radioactive emissions from decay of radioactive cobalt. Fig. 7
certainly does not look like a histogram of data collected from a time
series whose phase-manifold is two-dimensional, and of course it is
not: the data should consist of realisations of independent identically
exponentially distributed random variables, which the graph does not
contradict.

Fig. 7. Modified density estimate (100 bins) for 21 000 intervals
between detections of emissions from radioactive cobalt.


5.  Conclusions 

It  is  a  little  surprising  that  information  about 
the  nature  of  the  system’s  state  space  is  available 
directly  from  density  estimators  of  a  time  series 
or  other  sample.  With  the  tools  we  have  at 
present,  practical  use  is  limited,  and  the  density 
approach  can  best  be  regarded  as  complemen¬ 
tary  to  others,  but  further  developments  may 
give  rise  to  a  method  of  increased  scope  and 
robustness. 
A method is described for extracting from a chaotic time series a system of equations whose solution reproduces the
general features of the original data even when these are contaminated with noise. The equations facilitate calculation of
fractal dimension, Lyapunov exponents and short-term predictions. The method is applied to data derived from numerical
solutions of the logistic equation, the Hénon equations with added noise, the Lorenz equations and the Rössler equations.


1.  Introduction 

In  many  fields  of  science  one  measures  quan¬ 
tities  that  fluctuate  in  time  or  space  with  no 
discernible  pattern.  Examples  include  magnetic 
and  electric  fields  in  plasmas,  weather  and 
climatological  data,  variation  of  biological  popu¬ 
lations,  and  stock  prices.  It  has  been  generally 
assumed  that  such  situations  could  be  described 
by  a  large  number  of  deterministic  equations  or 
by  stochastic  ones.  More  recently  it  has  been 
appreciated  that  ordinary,  but  nonlinear,  dif¬ 
ferential  equations  with  as  few  as  three  degrees 
of  freedom  or  difference  equations  with  a  single 
degree  of  freedom  can  have  pseudo-random 
(chaotic)  solutions.  This  has  led  to  the  hope  that 
such  simple  systems  can  model  the  real  world. 

Ideally,  one  would  like  to  be  able  to  extract 
the  equations  from  a  fluctuating  time  series.  In 
the  absence  of  additional  information,  this  goal 
is  unrealistic.  The  variable  observed  may  not  be 
simply  related  to  the  fundamental  dynamical 
variables  of  the  system.  The  measurement  will 
be  contaminated  by  noise  and  round-off  errors 
and  limited  by  sample  rate  and  duration.  How¬ 


ever,  it  may  be  possible  to  find  a  system  of 
equations  which  mimic  the  general  features  such 
as  the  topology  in  a  suitable  phase  space,  and 
these  equations  might  shed  insight  into  the  be¬ 
havior  of  the  system. 

Here  we  present  a  method  for  extracting  from 
a  fluctuating  time  series  such  a  set  of  equations. 
These  model  equations  may  be  used  to  predict 
not  so  much  the  details  of  the  time  evolution, 
which  is  limited  by  sensitivity  to  initial  conditions 
in  chaotic  systems,  but  topological  changes  such 
as  the  change  of  periodic  behavior  through  a 
series  of  period-doubling  bifurcations.  Further¬ 
more,  because  the  model  equations  provide  in 
principle  an  unlimited  amount  of  data,  the  calcu¬ 
lation  of  fractal  dimension  [1]  and  Lyapunov 
exponents  [2]  is  much  simplified,  although  the 
relationship  between  such  calculated  quantities 
and  the  true  values  remains  an  intriguing  and 
open  question.  It  is  also  much  simpler  to  calcu¬ 
late  the  Lyapunov  exponents  directly  from  the 
equations  rather  than  from  the  data  [3,  4]. 

Of  course  there  have  been  previous  attempts 
to  extract  from  a  time  series  a  simple  set  of 
coupled equations whose solution gives an approximation in some sense to the original data.
Difference  equations  [5-7]  corresponding  to 
maps,  and  differential  equations  [8-10]  corre¬ 
sponding  to  flows  have  been  deduced.  Although 
the  method  described  here  has  features  common 
to  this  earlier  work,  the  novel  element  is  the  use 
of  singular  value  decomposition  to  choose  the 
appropriate  dependent  variables  that  appear  in 
the  dynamical  equations  rather  than  just  using 
the  data  and  its  derivatives.  An  added  advantage 
of singular value decomposition is that it provides
an efficient filter for the noise that is always
present  in  experimental  data. 

2.  Numerical  procedure 

Until  fairly  recently  the  main  method  of  analy¬ 
sis  has  been  to  express  the  time  series  T(t)  in  a 
set of Fourier modes. Then peaks in the associated
power spectrum are identified with normal
modes of the system. This approach breaks down
if  there  is  a  large  noisy  contribution  to  the 
measurement  or  if  the  underlying  system  cannot 
be  described  in  terms  of  a  few  modes. 

An alternative is to use the method of singular value decomposition
[11, 12]. In this method $T(t)$ is expanded in a complete set of modes
$\psi_m(t)$, not necessarily Fourier modes, but a set obtained from an
analysis of the data rather than imposed from outside. The modes are
normalized according to

$$\sum_{n=1}^{N} \psi_m(t = n\tau)\,\psi_{m'}(t = n\tau) = \lambda_m \delta_{mm'}, \qquad (1)$$

where the original data are assumed given at discrete times $n\tau$
with $1 \le n \le N$. We then approximate $T(t)$ by

$$T(t) \approx \sum_{m=1}^{d} \psi_m(t), \qquad (2)$$

where the $\psi_m$'s are chosen to correspond to the $d$ largest values
of $\lambda_m$.

In practice, from the data $T_n$ ($= T(t = n\tau)$) one constructs a
set of $M$-dimensional vectors $V_i$ defined such that
$V_i = [T_i, T_{i+1}, \ldots, T_{i+M-1}]$ and the auto-correlation
function $C(n)$ defined such that

$$C(n) = \sum_{i=1}^{N} T_i T_{i+n}. \qquad (3)$$

Using these $C(n)$'s one constructs the symmetric $M \times M$
correlation matrix $\mathbf{M}$ with elements
$\mathbf{M}_{lp} = C(|l - p|)$. The eigenvalues of this matrix are in
fact just the normalization constants introduced in eq. (1), that is
$\lambda_m$. The corresponding eigenfunctions ($a_m$) of this matrix
give the modes $\psi_m(t)$ according to
$\psi_m(t = n\tau) = a_m \cdot V_n$.

Besides giving the best set of orthogonal modes in the sense mentioned
above, this method involves some smoothing of the original data. A
purely random time series gives $C(n) = C\,\delta_{n,0}$ so that all
the eigenvalues are equal to $C$.

If  when  the  data  are  analyzed  a  few  eigen¬ 
values  are  significantly  larger  than  the  rest,  then 
the  corresponding  eigenfunctions  are  the  ones 
used  in  the  approximate  expansion  for  T{t)  in 
eq.  (2).  The  neglect  of  the  rest  has  the  effect  of 
removing  some  of  the  noise.  This  whole  proce¬ 
dure  is  analogous  to  identifying  peaks  in  a 
Fourier  power  spectrum.  The  partial  removal  of 
noise  by  singular  value  decomposition  has  been 
discussed  in  more  detail  by  Broomhead  and  King 
[11].  The  choice  of  which  of  the  eigenvalues  are 
significant  is  not  always  obvious,  and  a  subjective 
judgement  has  been  used  here.  A  more  rigorous 
procedure  would  follow  the  treatment  of 
Hediger  et  al.  [13]. 
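The construction of the correlation matrix and its eigenvalues can be sketched in a few lines. Several choices here are assumptions rather than details from the paper: the sum in eq. (3) is taken over the index pairs actually available in a finite series, the eigenvalues are found with a plain cyclic Jacobi iteration, and all function names are hypothetical.

```python
import math


def autocorrelation(series, lag):
    """C(n) = sum_i T_i T_{i+n}, summed over the pairs available in the data."""
    return sum(series[i] * series[i + lag] for i in range(len(series) - lag))


def correlation_matrix(series, window):
    """Symmetric window x window matrix with elements C(|l - p|)."""
    c = [autocorrelation(series, n) for n in range(window)]
    return [[c[abs(l - p)] for p in range(window)] for l in range(window)]


def jacobi_eigenvalues(matrix, sweeps=50):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        total = sum(x * x for row in a for x in row)
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= 1e-24 * total:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# A strictly alternating series has one dominant mode for window size 2.
series = [1.0, -1.0] * 50
eigenvalues = jacobi_eigenvalues(correlation_matrix(series, 2))
print(eigenvalues)
```

A few eigenvalues standing well above the rest, as in this toy series, is the situation in which the text keeps only the corresponding modes; for real data a library eigensolver would normally replace the hand-written Jacobi loop.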

From a physical point of view the $\psi_m$'s for $1 \le m \le d$ can be
interpreted as coherent structures revealed by the method of singular
value decomposition. Modern dynamical systems theory suggests that even
small values of $d$ may suffice to encapsulate the essential features
of the system. These features are perhaps best appreciated by examining
the $d$-dimensional phase space constructed using the functions
$\psi_1, \psi_2, \ldots, \psi_d$. These topological features are the
same as those present using a phase space constructed using the $V_n$'s
since the $V$'s are linear combinations of the $\psi$'s. Also, features
that are masked in the $V$ phase space due to noise may be revealed in
the $\psi$ space since some of the noisy component has been removed.
The more subtle point of whether a time series of a single variable
known just at discrete time intervals can capture the full solution of
the underlying problem which exists in continuous time and involves
many independent variables has been considered by Takens [14].

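Using the $d$ distinct $\psi_m$'s we construct a model equation of the
form

$$\psi_m^{n+1} = F_m(\psi_l^n), \qquad (4)$$

where $\psi_m^n = \psi_m(t = n\tau)$ and the $F_m$'s are as yet general
functions of the $\psi$'s. Guided by the fact that simple forms for the
model functions $F_m$ are sufficient to produce chaotic solutions we
assume that

$$F_m(\psi_l) = a_{m0} + \sum_{p=1}^{d} a_{mp}\psi_p + \sum_{p \le q}^{d} b_{mpq}\psi_p\psi_q + \sum_{p,q,r}^{d} c_{mpqr}\psi_p\psi_q\psi_r, \qquad (5)$$

where the coefficients $a$, $b$, and $c$ are determined by minimizing
the variational function

$$\Delta_m = \sum_{n=1}^{N} \left[\psi_m^{n+1} - F_m(\psi_l^n)\right]^2 \qquad (6)$$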

for each value of $m$. If examination of the phase-space portraits
reveals any symmetry then this should be incorporated into the
structure of $F_m$. Of course in some cases a simple polynomial may not
be the appropriate form for $F_m$. For example if the phase-space plots
reveal periodic structure then the model functions $F_m$ should be
chosen to capture such a structure. A suitable choice of polynomial has
been shown [15] to model data arising from the presence of a limit
cycle. An $F_m$ expressed as the ratio of two polynomials may have a
significantly wider range of application than a simple polynomial since
one is then using the power of a Padé approximant [16].

The  use  of  such  rational  functions  has  been 
studied  by  Casdagli  [5]  and  Gouesbet  [10].  How¬ 
ever,  it  is  important  to  note  that  there  is  prob¬ 
ably  no  universal  panacea,  and  the  form  for  the 
model functions $F_m$ should be chosen taking into
account  all  available  information  about  the  sys¬ 
tem.  Computer  software  that  carries  out  the 
procedure  described  above  as  well  as  many  other 
tests  for  chaotic  time  series  is  available  [17]. 

Singular  value  decomposition  methods  or 
equivalents  have  been  used  previously  to  obtain 
model  equations  [18-20],  but  in  those  cases  the 
exact  equations  describing  the  system  are  as¬ 
sumed  known.  The  exact  equations  are  then 
used  to  generate  the  model  equations.  Here  we 
only  use  the  restricted  information  given  by  the 
time  series. 

Experimental time series $T(t)$ can often be obtained for various
values of some control parameter $\mu$. Thus the $\psi_m$'s and $F_m$'s
are also functions of $\mu$. Numerical simulations of equations such as
those of Lorenz [21] and Rössler [22] and the study of phase
transformations using phenomenological models such as the
Landau-Ginzburg equation give good reason to believe that the
dependence of the coefficients $a$, $b$, and $c$ on $\mu$ is simple.

3.  Numerical  examples 

To  illustrate  some  of  the  above  techniques  we 
have  studied  a  few  selected  model  situations. 

3.1.  Logistic  equation 

First the logistic equation $x_{n+1} = \lambda x_n (1 - x_n)$ has been iterated and $x_n$ identified with the input time series. For $\lambda = 4$ the solution is illustrated in fig. 1a. The data have been analyzed using M = 2, and the phase plane in fig. 1b constructed from the model equations shows a single loop which is a simple distortion of fig. 1a. This simple loop structure remains if larger values of M are used and the number of model equations is taken equal to M. Even without knowledge of the original equations that generated the data, the method shows that a two-dimensional phase plane is sufficient to model the data, and the resulting equations can be linearly combined to recover the logistic equation exactly.

Fig. 1. Phase-space plots of the logistic equation (a) from original data and (b) after singular value decomposition.
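The claim that the logistic equation can be recovered from the data alone can be illustrated with a much simpler fit than the one used in the paper. The sketch below is an assumption-laden stand-in, not the authors' singular value procedure: it fits a quadratic map directly to successive values by ordinary least squares. The function names and the starting value 0.3 are hypothetical.

```python
import numpy as np

def logistic_series(lam=4.0, x0=0.3, n=500):
    """Iterate x[n+1] = lam * x[n] * (1 - x[n]) and return the series."""
    x = np.empty(n)
    x[0] = x0
    for i in range(n - 1):
        x[i + 1] = lam * x[i] * (1.0 - x[i])
    return x

def fit_quadratic_map(x):
    """Least-squares fit of x[n+1] = a0 + a1*x[n] + a2*x[n]**2."""
    design = np.column_stack([np.ones(len(x) - 1), x[:-1], x[:-1] ** 2])
    coeffs, *_ = np.linalg.lstsq(design, x[1:], rcond=None)
    return coeffs

x = logistic_series()
a0, a1, a2 = fit_quadratic_map(x)
# For noise-free data the fit should return a0 near 0, a1 near lam, a2 near -lam.
```

Because the data satisfy a quadratic relation exactly, the fitted coefficients reproduce the generating map to rounding error; with noisy or higher-dimensional data this direct fit would no longer be adequate.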

3.2. Hénon equations

The Hénon map, $x_{n+1} = 1 - 1.4x_n^2 + 0.3y_n$, $y_{n+1} = x_n$, has been treated in a similar manner. The results in fig. 2 using M = 2 again show that the model equations capture the essential features of the solution, in this case a strange attractor.

A time series of 2300 values was generated by adding normally distributed deviates with zero mean and standard deviation 0.1 to the Hénon data. These data for M = 2 are shown in fig. 3a. Using these data, two coupled model equations were obtained and solved, and a new time series $X_n$, the sum of the two $\psi$ components, was generated. This is shown in fig. 3b and is indistinguishable from the Hénon map as given in fig. 2a.

In this case the reduction in noise is solely a result of forcing the data to fit a relatively simple set of equations, since only a 2 × 2 correlation matrix was used and thus all the noise survives the singular value decomposition. Such a method should be used with caution since it tends to simplify the dynamics of the system.

Fig. 2. Phase-space plots of the Hénon map (a) from original data and (b) after singular value decomposition.

Fig. 3. Phase-space plots of the Hénon map (a) with noise added and (b) from a solution of the model equations fit to the noisy data. A comparison of (b) with fig. 2a shows that the method has completely removed the noise and restored the original data.

3.3.  Lorenz  equations 

The Lorenz equations

$$
\begin{aligned}
dx/dt &= \sigma(y - x), \\
dy/dt &= rx - y - xz, \\
dz/dt &= xy - bz,
\end{aligned} \tag{7}
$$

with $\sigma = 10$, $r = 28$, and $b = 8/3$ were solved numerically and 1000 values of $x(t)$ at $t = 0.05n$ taken as the input time series. These data are shown in fig. 4a. The neglect of the information contained in the variables y(t) and z(t) mirrors the experimental situation where only a limited amount of information is available. The corresponding phase space constructed using three
eigenfunctions corresponding to the three largest eigenvalues is shown in fig. 4b. This phase-space plot is insensitive to the value of M. A model set of three equations was then constructed using the full cubic form of F given in eq. (5). Their solution is shown in fig. 4c and is seen to capture the essentials of the time behavior. The plots obtained using the model equations are for a much longer time than the original data were given.

Fig. 4. Three-dimensional phase-space plots of the Lorenz attractor showing that the topology of the attractor is preserved. (a) Original input data, (b) result of singular value decomposition, and (c) solution of the model equations.

Fig. 5. Three-dimensional phase-space plots of the Rössler attractor showing that the topology of the attractor is preserved. (a) Original input data, (b) result of singular value decomposition, and (c) solution of the model equations.
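The construction of a three-dimensional phase space from the single variable x(t) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the trajectory matrix is built from overlapping windows of length M = 9, the column means are removed, and the integrator, sub-step count, transient length and initial state are all hypothetical choices.

```python
import numpy as np

def lorenz_rhs(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations, eq. (7)."""
    x, y, z = state
    return np.array([sigma * (y - x), r * x - y - x * z, x * y - b * z])

def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(state + 0.5 * h * k1)
    k3 = f(state + 0.5 * h * k2)
    k4 = f(state + h * k3)
    return state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def sample_x(n_samples=1000, tau=0.05, substeps=10, transient=200):
    """Return x(t) sampled every tau after discarding a transient."""
    state = np.array([1.0, 1.0, 20.0])   # assumed starting point
    h = tau / substeps
    for _ in range(transient * substeps):
        state = rk4_step(lorenz_rhs, state, h)
    xs = np.empty(n_samples)
    for i in range(n_samples):
        xs[i] = state[0]
        for _ in range(substeps):
            state = rk4_step(lorenz_rhs, state, h)
    return xs

def svd_embedding(series, window=9, dim=3):
    """Project overlapping windows onto the dim leading singular vectors."""
    rows = len(series) - window + 1
    traj = np.stack([series[i:i + window] for i in range(rows)])
    traj = traj - traj.mean(axis=0)
    _, s, vt = np.linalg.svd(traj, full_matrices=False)
    return traj @ vt[:dim].T, s

x = sample_x()
psi, singular_values = svd_embedding(x)
```

Plotting the three columns of `psi` against one another gives a portrait of the kind described for fig. 4b; changing `window` is a quick way to probe the stated insensitivity to M.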

The correlation dimension calculated using the method of Grassberger and Procaccia [1] with the original data set of 1000 points is 1.97 ± 0.18, and the value calculated from 13 000 values generated by solving the model equations is 2.10 ± 0.10. This is to be compared with the accepted value [1] of 2.05 ± 0.01. It has been pointed out by Ott et al. [23] that correlation dimensions are not necessarily invariant under coordinate changes. However, in the present case, since the $\psi_n$'s are just linear combinations of the $T(n\tau)$'s, the correlation dimension must be invariant.
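For readers unfamiliar with the correlation dimension, a bare-bones estimate can be sketched as below. It assumes the simplest form of the correlation sum (fraction of point pairs closer than r) and a straight-line fit of log C(r) against log r over a hand-picked range of radii; the function names, the radii and the test set (points on a unit circle, whose dimension should be close to 1) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def correlation_sum(points, radii):
    """Fraction of distinct point pairs separated by less than each radius."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    pair_d = dists[np.triu_indices(len(points), k=1)]
    return np.array([(pair_d < r).mean() for r in radii])

def correlation_dimension(points, radii):
    """Slope of log C(r) versus log r over the supplied radii."""
    c = correlation_sum(points, radii)
    slope, _ = np.polyfit(np.log(radii), np.log(c), 1)
    return slope

# Sanity check on a set of known dimension: random points on a circle.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 800)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
radii = np.logspace(-1.3, -0.3, 10)
d_circle = correlation_dimension(circle, radii)
```

The quoted uncertainties in the text reflect how sensitive such a slope is to the scaling range and the number of points, which is why a check on a set of known dimension is worthwhile before applying the estimate to reconstructed data.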

3.4. Rössler equations

A similar treatment has been applied to the Rössler equations

$$
\begin{aligned}
dx/dt &= -(y + z), \\
dy/dt &= x + \alpha y, \\
dz/dt &= \beta + z(x - \gamma),
\end{aligned} \tag{8}
$$

with $\alpha = \beta = 1/5$, $\gamma = 5.7$, and $t = 0.2n$. The corresponding results are shown in fig. 5. The correlation dimension for this case calculated from the original data set of 1000 points is 1.92 ± 0.08, and the value calculated from 15 000 values generated by solving the model equations is 1.94 ± 0.08. The expected value is slightly greater than 2.0.

4.  Sensitivity  to  parameters 

This whole procedure has been carried out for the Lorenz equations for a range of r values between 25 and 90, and in particular the coefficients a, b, and c appearing in eq. (5) were evaluated as a function of r. The variation with r of the coefficients of the largest nine terms is shown in fig. 6a, from which it is seen that the variation is reasonably smooth. From the symmetry of the Lorenz equations, the terms involving even powers of the $\psi$'s are negligibly small. Using the least squares method, the coefficients are readily fitted to simple polynomials in r. A cubic fit as shown in fig. 6b is sufficient.

Fig. 6. (a) Variation of the nine largest coefficients of the model equations with the parameter r in the Lorenz equations, (b) along with a least squares fit of each coefficient to a cubic polynomial in r.

One now has a set of dynamical equations of the form given by eqs. (4) and (5) where the coefficients a, b, and c are known in the form of simple polynomials in the parameter r. It is on this set of equations that one can base an interpolation or extrapolation procedure. By taking r values other than those measured, and solving the model equations, the behavior of the system can be predicted. This can be in the form of the relevant phase-space plot or by using eq. (2) to form x(t).

Fig. 7. Phase-space portraits for the Lorenz attractor with r = 57 (a) obtained directly from singular value decomposition, and (b) resulting from solution of the model equations with coefficients calculated from least squares fits to cubic polynomials in r.

The phase-space portrait for r = 51 obtained directly from the values of the $\psi$'s is shown in fig. 7a, while the form predicted using the above procedure is shown in fig. 7b. The agreement is good. Extrapolation outside the range of measured values should be applied with caution, however, since the least squares method of fitting curves to polynomials is not ideal.
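The interpolation step, fitting each model coefficient to a cubic in r and evaluating the fit at an unmeasured r, can be sketched as follows. The coefficient values here are synthetic: `synthetic_coefficient` is a made-up smooth function standing in for one of the fitted coefficients, since the paper's actual values are not reproduced in the text. The grid spacing and the evaluation point are likewise assumptions.

```python
import numpy as np
from numpy.polynomial import Polynomial

def synthetic_coefficient(r):
    """Hypothetical smooth dependence of one model coefficient on r."""
    return 0.8 - 0.02 * r + 3.0e-4 * r**2 - 1.5e-6 * r**3

# Pretend the coefficient was measured on a grid of r values from 25 to 90.
r_grid = np.arange(25.0, 91.0, 5.0)
c_grid = synthetic_coefficient(r_grid)

# Least-squares cubic in r; Polynomial.fit rescales r internally for stability.
cubic = Polynomial.fit(r_grid, c_grid, deg=3)

# Interpolate to an r value that is not on the measured grid.
r_new = 57.0
c_interp = cubic(r_new)
```

Repeating this for every coefficient gives a full set of model equations at the new r. Evaluating `cubic` outside the interval 25 to 90 is the extrapolation that the text advises treating with caution.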

5.  Discussion 

In the above method there are two quantities, namely M, the order of the correlation matrix, and d, the number of significant eigenfunctions retained. These are to be considered as parameters of the method which can be adjusted to obtain the best fit between the real system under investigation, through the data T(t), and the solution of the model, eq. (4). Since we envisage applying the method to situations where the auto-correlation function shows little structure, we hope the complicated time variation can be attributed to the presence of a strange attractor. Then the parameters M and d are chosen to represent best the topological features of the attractor.

An alternative approach would be to introduce constraints into the quantity that is to be minimized. For example, if it is apparent from the phase-space plots that the phase portrait has certain symmetry properties, then a term

A  S  +  1)  M]  -  OF„[t(t.(s  A/)]}^  (9) 

i 

could be added to eq. (6). Here O is the symmetry operator, and A is a Lagrange multiplier. Furthermore, one may impose a smoothness condition on the fit by adding a term which minimizes the average second derivative,

i 

-2F„il,j[{s-l)M]}\  (10) 

The  Lagrangian  multipliers  can  then  be  used  to 
get  the  best  fit  to  the  coefficients  a,  b,  and  c. 
However,  the  results  given  here  are  optimized 
only  by  changing  M. 

The results for the Lorenz and Rössler equations have been obtained using the value of $x_n$ at only 1000 points. The phase-space portraits for the model equations are shown for times longer than a thousand time intervals, illustrating the stability of the equations.

However, the coefficients in the model equations, and hence the solution of these equations, depend sensitively on the order of the correlation matrix M. Though the value of T(t) (that is, x) generated using eq. (2) with d = 3 is in good agreement (over the time where x(t) is given) with the original data, the associated model equations do not in general reconstruct the strange attractor. Usually after a short interval of time the solutions tend to become infinite or attract to a fixed point or limit cycle. There is an optimal
choice of M for the reconstruction of the attractor. It is reasonable to expect this value to be associated with (a) the maximum difference between $\lambda_d$ and the higher eigenvalues and (b) the elements C(n) used in the correlation matrices spanning the region where the major variation of C occurs. For the results presented in the case of the Lorenz equation, a value of 9 has been found to be appropriate, while for the Rössler equation, because of the longer correlation time, it was found optimal to make M = 16.

The relative difficulty of finding chaotic solutions to the model equations raises the more general question of how common chaos is. A numerical experiment was carried out in which about $10^6$ three-dimensional cubic maps of the form given by eqs. (4) and (5) were iterated with the 60 coefficients chosen randomly over a 60-dimensional hypercube with each side extending from -1.2 to 1.2. Initial conditions were chosen near the origin, and the Lyapunov exponent was calculated for each case. About 99% of the solutions were unstable. (This number increases rapidly with the size of the hypercube.) Of the roughly 10 000 stable solutions, which are candidates for modeling bounded physical processes, about two-thirds attract to a fixed point and about one-third are either limit cycles or two-tori. A small subset of about 4% are strange attractors with positive Lyapunov exponents.
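The structure of this experiment can be sketched in a few lines. The count of 60 coefficients follows from the 20 monomials of degree at most three in three variables, one set per equation. Everything else below is an assumption made for illustration: the ordering of the monomials, the initial condition, the escape threshold, and the two-trajectory estimate of the largest Lyapunov exponent are not taken from the paper.

```python
import itertools
import numpy as np

def cubic_monomials(n_vars=3, max_degree=3):
    """Exponent tuples of all monomials in n_vars variables up to max_degree."""
    return [e for e in itertools.product(range(max_degree + 1), repeat=n_vars)
            if sum(e) <= max_degree]

EXPONENTS = np.array(cubic_monomials())      # shape (20, 3)

def apply_map(coeffs, state):
    """One iteration of the cubic map; coeffs has shape (3, 20)."""
    terms = np.prod(state ** EXPONENTS, axis=1)
    return coeffs @ terms

def largest_lyapunov(coeffs, n_iter=2000, escape=1e6, d0=1e-8):
    """Largest exponent from a renormalized nearby orbit; None if unbounded."""
    state = np.full(3, 0.01)                 # assumed start near the origin
    shadow = state + np.array([d0, 0.0, 0.0])
    total = 0.0
    for _ in range(n_iter):
        state = apply_map(coeffs, state)
        if not np.all(np.abs(state) < escape):
            return None                      # orbit escaped: unstable case
        shadow = apply_map(coeffs, shadow)
        sep = np.linalg.norm(shadow - state)
        if sep == 0.0:
            return float("-inf")             # separation collapsed entirely
        total += np.log(sep / d0)
        shadow = state + (shadow - state) * (d0 / sep)
    return total / n_iter

rng = np.random.default_rng(0)
coeffs = rng.uniform(-1.2, 1.2, size=(3, 20))   # one point of the hypercube
exponent = largest_lyapunov(coeffs)
```

Looping over many random draws of `coeffs` and tallying the cases `None`, negative, near zero and positive would give a rough analogue of the unstable, fixed-point, periodic and chaotic fractions quoted above, although the percentages need not match.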

Thus we conclude that for the subset of phenomena that can be represented by three-dimensional cubic maps, nature is chaotic about 4% of the time. There is some evidence to suggest that the regions of parameter space corresponding to chaotic motion are elongated. This means that as long as the parameter change is along the direction of elongation the system has a degree of robustness. The imposition of a symmetry requirement tends to elongate the chaotic region. A by-product of this calculation was the generation of several hundred new examples of strange attractors, no two of which look the same.


6.  Conclusion 

A method has been described for determining a set of model equations from limited data whose global solutions resemble those of the original data. However, the method is not robust, and the existence of a strange attractor depends sensitively on the value of M. This sensitivity might be reduced by a better choice of the variational function.


Acknowledgements 

One of us (G.R.) would like to acknowledge fruitful discussion with J. Alex Thomson at an early stage of this work and the hospitality of the University of Wisconsin-Madison Physics Department. This work was supported in part by the US Department of Energy.

References 

[1] P. Grassberger and I. Procaccia, Physica D 9 (1983) 189.

[2] A.M. Lyapunov, Ann. Math. Studies 17 (Princeton Univ. Press, Princeton, NJ, 1949).

[3] G. Benettin, L. Galgani, A. Giorgilli and J.-M. Strelcyn, Meccanica 15 (1980) 9.

[4] I. Shimada and T. Nagashima, Prog. Theor. Phys. 61 (1979) 1605.

[5] M. Casdagli, Physica D 35 (1989) 335.

[6] J.D. Farmer and J.J. Sidorowich, Phys. Rev. Lett. 59 (1987) 845.

[7] J.P. Crutchfield and B.S. McNamara, Complex Systems 1 (1987) 417.

[8] N.H. Packard et al., Phys. Rev. Lett. 45 (1980) 712.

[9] J. Cremers and A. Hübler, Z. Naturforsch. A 42 (1986) 797.

[10] G. Gouesbet, Phys. Rev. A 43 (1991) 5321.

[11] D.S. Broomhead and G.P. King, Physica D 20 (1986) 217.

[12] R. Vautard and M. Ghil, Physica D 35 (1989) 395.

[13] T. Hediger, A. Passamante and M.E. Farrell, Phys. Rev. A 41 (1990) 5325.

[14] F. Takens, Detecting Strange Attractors in Turbulence, Lecture Notes in Mathematics, D.A. Rand and L.-S. Young, eds. (Springer, Berlin, 1981) p. 366.

[15] J.R. Rice, The Approximation of Functions, Vol. 1 and 2 (Addison-Wesley, Reading, MA, 1969).

[16] P.R. Graves-Morris, Padé Approximants and their Application (Academic, New York, 1973).

[17] J.C. Sprott and G. Rowlands, Chaos Data Analyzer, Physics Academic Software, Box 8202, North Carolina State University, Raleigh, NC 27695.

[18] N. Aubry, P. Holmes, J.L. Lumley and E. Stone, J. Fluid Mech. 192 (1988) 115.

[19] L. Sirovich and J.D. Rodriguez, Phys. Lett. A 120 (1987) 211.

[20] K.S. Ball, L. Sirovich and L.R. Keefe, Int. J. Num. Methods Fluids 12 (1991) 585.

[21] E.N. Lorenz, J. Atmos. Sci. 20 (1963) 130.

[22] O.E. Rössler, Phys. Lett. A 57 (1976) 397.

[23] E. Ott, W.D. Withers and J.A. Yorke, J. Stat. Phys. 36 (1984) 687.


Physica  D  58  (1992)  260-272 
North-Holland 


Global  unpredictability  in  nonlinear  dynamics:  capture, 
dispersal  and  the  indeterminate  bifurcations 

J.M.T.  Thompson 

Centre for Nonlinear Dynamics and its Applications, Civil Engineering Building, University College London,
Gower Street, London WC1E 6BT, UK

Received  30  August  1991 

Revised  manuscript  received  6  February  1992 

Accepted  6  February  1992 


This  paper  surveys  in  general  terms  some  of  the  problems  of  predictability  that  arise  in  regular  and  chaotic  dynamics. 
These  relate  to  attractors,  basins  and  bifurcations,  and  a  capture-dispersal  diagram  is  introduced  to  assess  relative  degrees 
of  global  unpredictability.  Indeterminate  bifurcations  emerge  as  the  most  severe  generators  of  unpredictability,  and  some 
new  examples  involving  both  regular  and  chaotic  events  are  presented  as  illustrations. 


1.  Introduction 

The unpredictability of a dynamical system can be examined locally, in terms of the uncertainty surrounding the simulation of a required trajectory, or globally, in terms of uncertainties of response within families of trajectories. Key qualitative indices for global unpredictability are capture (c) and dispersal (d), a severe phenomenon being one with high c + d that captures a lot of the dynamics and then disperses it widely. Plotted on a capture-dispersal diagram, as in fig. 1, these coordinates give a useful overview of the sources of unpredictability within the attractors, basins and bifurcations of nonlinear dissipative dynamics.

Among the attractors, it is of course steady state chaos that generates the greatest degree of unpredictability. The chaotic attractor captures all trajectories within its basin and disperses them by a spreading, folding and mixing action within a well-defined, albeit fractal, topological structure. Basin boundaries are an obvious mechanism of dispersal within the trajectories of a system. A smooth or fractal boundary can achieve dispersal to an arbitrary number of distinct and remote attractors, the fractal boundary generating additionally the mixing action of chaotic transients associated with chaotic saddle solutions. Basin boundaries thus score well on dispersal, but badly on capture because only trajectories “within” the boundary zone are involved.

The discontinuous-dangerous bifurcations, characterized by the blue-sky disappearance of an attractor, give an inevitable fast dynamic jump to a remote steady state (a point, periodic or chaotic attractor). Most can be indeterminate, in the sense that more than one outcome is possible, the attractor chosen depending sensitively on the precise manner in which the bifurcation is realised. Regular local bifurcations can exhibit this indeterminacy, and an indeterminate tangled saddle-node is a common bifurcational feature in the resonance of softening oscillators [1-3]. These indeterminate bifurcations can be seen as a mechanism for sweeping all trajectories from a basin in phase-control space precisely onto a (smooth or fractal) basin boundary, from which they are dispersed to two or more uncorrelated and remote attractors. They achieve the high dispersal of a basin boundary while capturing, during a slow parameter sweep, a full basin of trajectories. The indeterminate bifurcations therefore generate, in a very real sense, the highest degree of global unpredictability in regular and chaotic dynamics.

[Figure 1 labels. Capture: all trajectories in a phase space basin; trajectories near a fractal basin boundary; trajectories near a basin boundary. Dispersal ("How wide is the dispersal?"): dispersal within a single attractor; two different attractors; two or more attractors. Plotted features: attractor paths and determinate bifurcations; quasi-periodic attractor; chaotic attractor.]

Fig. 1. A notational capture-dispersal diagram, summarizing the degree of global unpredictability generated by different features of the phase-control portrait of a nonlinear dissipative system.

2. Predictability in simulations

Before addressing the main theme of global uncertainty involving families of trajectories, we review the problem of simulating a single trajectory in the presence of chaotic motions. In nonlinear dynamics, characterized for example by a set of ODEs, there is always a unique trajectory from a given starting condition. So even when the behaviour is, in modern parlance, chaotic, the motion is strictly deterministic. However, for even the simplest nonlinear system there is little possibility of a closed-form solution: often there is no analytical solution, as in the presence of chaotic motions. So resort must be made to computers, which brings us to the more pragmatic consideration of predictability.

In the real world, be it in an analogue simulation, in a laboratory experiment or in the outside environment, any system is subjected to random external disturbances, described as noise: macroscopic mechanical systems also experience internal noise from thermal vibrations of individual molecules. Likewise in digital computations, a required trajectory will be constantly perturbed by numerical noise. This arises firstly from the numerical integration scheme itself, and secondly from round-off errors of the finite precision arithmetic. Referring collectively to analogue, digital and laboratory investigations as simulations, what possibility is there of using these to predict the future in the presence of this all-pervasive noise?

To answer this question requires a stochastic analysis. But we gain some insight by considering adjacent motions of the original idealized noise-free dynamical system, S. Noise tends to knock the system from one trajectory to another. So if adjacent trajectories stay close, or better still converge, there is hope that simulations will give a good prediction. Conversely, if adjacent trajectories diverge, we might expect there to be a practical predictability horizon [4] beyond which any simulations will depart significantly from the desired trajectory.

In any large but finite time, T, there is, we recall, a strict continuity of response of system S. If trajectories in phase space from two starting points x(0) and X(0) lead in time T to points x(T) and X(T), then as x(0) approaches arbitrarily close to X(0), so x(T) → X(T). Similar continuity holds for a small perturbation of S, induced for example by a small change in a control.

Clearly we must consider not just one, but all adjacent trajectories. Now for the autonomous set of first-order differential equations, $\dot{x}_i = F_i(x_j)$, where a dot denotes differentiation with respect to time, t, we have the divergence function

$$\operatorname{div}(\dot{x}_i) = \partial F_1/\partial x_1 + \partial F_2/\partial x_2 + \partial F_3/\partial x_3 + \cdots .$$

We might suppose that totally dissipative systems, for which $\operatorname{div}(\dot{x}_i)$, although not necessarily constant, is everywhere negative, would be the most predictable: and for this reason, we shall focus on them in this section.

With positive definite dissipation, any cloud of starts contracts continuously with its volume tending exponentially to zero as $t \to \infty$. A typical solution will experience a transient converging asymptotically onto a steady state attractor of zero phase volume. These steady states can be point attractors, periodic or quasi-periodic attractors, or chaotic attractors.
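As a worked illustration of this contraction, consider a linearly damped pendulum written as two first-order equations. This example and its symbols are assumptions introduced here for illustration, with $\beta > 0$ a damping coefficient and $V(t)$ the phase volume of a cloud of starts; they are not taken from the text above.

```latex
\dot{x} = y, \qquad \dot{y} = -\beta y - \sin x ,
\\[4pt]
\operatorname{div} = \frac{\partial y}{\partial x}
  + \frac{\partial(-\beta y - \sin x)}{\partial y} = -\beta < 0 ,
\\[4pt]
\frac{dV}{dt} = \operatorname{div}\cdot V = -\beta V
\quad\Longrightarrow\quad
V(t) = V(0)\, e^{-\beta t} \to 0 \ \text{ as } t \to \infty .
```

Here the divergence happens to be constant, so the contraction rate is uniform over phase space; in the general totally dissipative case the rate varies from point to point but the volume still shrinks monotonically.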


It is of course the chaotic attractors that give the greatest predictability problem. In these we have such a rapid contraction onto a sheet that there can be exponential divergence within the sheet, consistent with the overall volume contraction. A repetitive feature of the dynamics ensures that this sheet is repeatedly folded, as in the making of flaky pastry. The result is an infinitely layered fractal structure in phase space, the chaotic attractor, within which the post-transient motions wander for ever in an essentially random manner. It is the combined effects of divergence and mixing within a well-defined and localized fractal structure that characterize a chaotic attractor and pose limits to the detailed predictability of its motions. So even in a totally dissipative system, a trajectory in a chaotic attractor has adjacent diverging trajectories within the sheet. This divergence is exponential, making long term predictions impossible.

Consider, for example, the tracking of a fundamental trajectory of S by an adjacent trajectory of S whose starting coordinates differ by a small perturbation, $\varepsilon$. If the error has the form $\varepsilon \exp(\lambda t)$ with $\lambda > 0$, this will reach a level of unacceptability, E, at a predictability horizon, H, given by $\lambda H = \ln(E/\varepsilon)$. If we repeatedly halve the perturbation to $\varepsilon/2$, $\varepsilon/4$, ..., the new predictability horizon is

$$\lambda H = \ln(E/\varepsilon) + \ln(2) + \ln(2) + \cdots .$$

Every time we double the precision of the initial condition, we just succeed in pushing the horizon forward by one step of $\Delta H = (1/\lambda) \ln(2)$. The horizon changes only linearly with the number of decimal places used to describe the initial conditions. Moreover the mixing action of a chaotic attractor ensures that beyond the horizon the successively improved simulations do not march towards the required solution in a simple systematic manner.
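The arithmetic of the horizon can be checked with a few lines of code. The values of the divergence rate, the tolerance and the starting perturbation below are hypothetical, chosen only to make the linear growth of H visible.

```python
import math

def predictability_horizon(eps, tolerance, lam):
    """Horizon H from lam * H = ln(E / eps), valid for lam > 0."""
    return math.log(tolerance / eps) / lam

lam = 0.5          # assumed exponential divergence rate
tolerance = 1.0    # assumed level of unacceptability E
eps0 = 1.0e-3      # assumed initial perturbation

# Halve the perturbation repeatedly and record the horizon each time.
horizons = [predictability_horizon(eps0 / 2**k, tolerance, lam)
            for k in range(5)]
steps = [later - earlier for earlier, later in zip(horizons, horizons[1:])]
```

Each entry of `steps` equals ln(2)/lam, so every doubling of precision buys the same fixed extension of the horizon, however small the perturbation already is.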

A similar situation prevails when we try to simulate the response of S from a fixed starting condition using a digital computer. Every time we double the precision of the numerical integration we can only expect to increase the predictability horizon by a linear step. This is illustrated in fig. 2, summarizing seven Runge-Kutta simulations of the chaotic tumbling of a parametrically excited pendulum described by

$$\ddot{x} + \beta\dot{x} + [1 + A\cos(\omega t)]\sin(x) = 0 .$$

The start, roughly on a predetermined chaotic attractor, is held constant throughout. A fourth-order Runge-Kutta algorithm is employed with J steps per forcing cycle, J = 8, 16, ..., 512. Comparative plots of x(t) are shown in the left-hand graphs. The roughly linear variation of the predictability horizon under halving of J is illustrated in the table which compares the signs of the stroboscopically sampled values of $\dot{x}(t) = y(t)$ at phase $\phi = 0$.

Fig. 2. Chaotic tumbling of a parametrically excited pendulum, showing the variation of the predictability horizon with the size of the Runge-Kutta time step. Parameter values are $\beta = 0.2$, $A = 1.59$, $\omega = 1.79$. All runs start at x(0) = -1.737, y(0) = 2.026. Window of the time series is $0 < t < 70$, $-14\pi < x < 14\pi$.
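An experiment of this kind can be sketched as follows, using the parameter values and starting point quoted in the caption of fig. 2. The code is an illustrative reconstruction under assumptions, not the program used for the figure: the number of forcing cycles, the choice of the J = 512 run as the reference, and the helper names are hypothetical.

```python
import math

BETA, AMP, OMEGA = 0.2, 1.59, 1.79   # values quoted for fig. 2

def rhs(t, x, y):
    """First-order form: dx/dt = y, dy/dt = -beta*y - [1 + A cos(wt)] sin x."""
    return y, -BETA * y - (1.0 + AMP * math.cos(OMEGA * t)) * math.sin(x)

def strobe_samples(steps_per_cycle, n_cycles=20, x0=-1.737, y0=2.026):
    """RK4 with J steps per forcing cycle; return (x, y) once per cycle."""
    h = (2.0 * math.pi / OMEGA) / steps_per_cycle
    x, y = x0, y0
    samples = [(x, y)]
    for cycle in range(n_cycles):
        for step in range(steps_per_cycle):
            t = (cycle * steps_per_cycle + step) * h
            k1x, k1y = rhs(t, x, y)
            k2x, k2y = rhs(t + h / 2, x + h / 2 * k1x, y + h / 2 * k1y)
            k3x, k3y = rhs(t + h / 2, x + h / 2 * k2x, y + h / 2 * k2y)
            k4x, k4y = rhs(t + h, x + h * k3x, y + h * k3y)
            x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
            y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        samples.append((x, y))
    return samples

def first_sign_disagreement(run, reference):
    """Index of the first cycle at which the signs of y differ, else None."""
    for n, ((_, ya), (_, yb)) in enumerate(zip(run, reference)):
        if (ya >= 0.0) != (yb >= 0.0):
            return n
    return None

runs = {J: strobe_samples(J) for J in (8, 16, 32, 64, 128, 256, 512)}
horizons = {J: first_sign_disagreement(runs[J], runs[512])
            for J in runs if J != 512}
```

Inspecting `horizons` gives a crude measure, in forcing cycles, of how long each coarser run tracks the finest one. A sign comparison can agree by chance, so it is only a rough proxy for the horizon discussed in the text.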

Although detailed correlation is lost at the predictability horizon, the strong phase contraction onto the folding sheet ensures that the more accurate simulations all stay close to a well-defined attracting fractal structure, which is therefore presumed to be that of system S. Fig. 3 shows the similar chaotic attractors obtained at J = 64 and 32. Comparable pictures are generated by totally different algorithms, which lends support to the view that numerically drawn chaotic attractors are meaningful representations of the underlying dynamics.

Fig. 3. Stroboscopically sampled chaotic attractor of the tumbling, parametrically excited pendulum. Qualitatively similar attractors obtained at different values of the integration step size. Window is $-\pi < x < \pi$, $-4 < y < 4$. Data as in fig. 2.

In this context we can refer to Grebogi et al. [5] who discuss the relationship between a computer-generated “noisy” trajectory and a true one, with applications to two representative Hamiltonian systems. They develop a rigorous procedure to show that some true trajectories remain close to the noisy one for long times. To show that a true trajectory shadows the noisy trajectory in this way, they employ a combination of containment, which establishes the existence of an uncountable number of true trajectories close to the noisy one, and refinement, which produces a less noisy trajectory. This is applied successfully to the noisy chaotic trajectories of the standard map and the undamped driven pendulum.

Despite this preservation of the underlying fractal structure, a chaotic attractor will always have a severe predictability horizon for a single trajectory, preventing detailed long term correlations between any of the following: (a) the fundamental trajectory of the idealized system; (b) any adjacent trajectory of the idealized system; (c) any trajectory of an idealized perturbed system; (d) any individual trajectory of a laboratory, analogue or digital simulation; (e) any individual trajectory of a real-world prototype system.

3.  Capture  and  dispersal 

In a global sense, the degree of unpredictability of a system is clearly related to ideas of capture and dispersal, as we have intimated. So we shall now focus on this aspect, and look in turn at attractors, basins and bifurcations.

3.1.  The  attractors 

An attractor captures all trajectories initialized within its basin and draws them towards a central attracting set. It scores a high capture measure, but has a low (perhaps we should say negative) dispersal. The attractors can therefore be positioned as shown on our schematic capture-dispersal diagram of fig. 1. Here, in recognition of the different compactions achieved, we have assigned increasing dispersal measures as we progress from a point attractor, through periodic and quasi-periodic attractors, to the chaotic attractor.

3.2.  The  smooth  basin  boundary 

The simplest and most direct way to obtain a dramatic splitting of two “adjacent” trajectories is to place the two starting points, x(0) and X(0), so that they straddle a point B on a smooth basin boundary. Then, as $t \to \infty$, the two motions will diverge and head towards two totally different and remote attractors. It is important to remember, though, that this behaviour does not violate the continuous dependence on initial conditions in finite time, because the essential splitting action will invariably involve slow dynamical motions near an unstable steady state.

Consider first a boundary in a 2D phase space formed by the
stable manifold (inset) of a saddle fixed point, D. Trajectories
from a line of starts joining [x(0), X(0)] will slow down as
they approach D, giving a line [x(T), X(T)] parallel to the
outset of D, and approaching it slowly as $T \to \infty$.
Increasing T just gives a similar picture, trajectories
approaching closer to the outset but still exhibiting the
continuous dependency.
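
The slow passage near the saddle can be made concrete with a worked example. The linear saddle below is an illustrative assumption, not a system taken from the text; the symbols $u$, $v$, $\lambda$, $\nu$ and $u_0$ are introduced here only for the sketch. The inset is the line $u = 0$, and two starts at $(\pm u_0, v_0)$ straddle it.

```latex
\dot{u} = \lambda u, \qquad \dot{v} = -\nu v, \qquad \lambda, \nu > 0,
\qquad
u(t) = u_0 e^{\lambda t}, \qquad v(t) = v_0 e^{-\nu t}.

% Time needed for a start at distance |u_0| from the inset to reach |u| = 1:
T(u_0) = \frac{1}{\lambda} \ln \frac{1}{|u_0|} \;\to\; \infty
\quad \text{as } u_0 \to 0 .

% Separation of the two straddling starts at any fixed finite time T:
|u_{+}(T) - u_{-}(T)| = 2 |u_0| e^{\lambda T} \;\to\; 0
\quad \text{as } u_0 \to 0 .
```

The second limit is the continuous dependence on initial conditions at finite time; the first shows that the eventual splitting costs a time that grows without bound as the starts approach the boundary.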

Despite this strict mathematical continuity, trajectories from
two points straddling the boundary, located at small but not
infinitesimal distances from B, will be perceived to disperse
the response to the two remote attractors. In this sense we
shall say that a smooth boundary formed by the inset of a saddle
in $R^2$ achieves a dispersal to two different attractors.

An  unstable  limit  cycle  is  a  second  example  of 
a  basin  boundary  in  a  2D  flow.  Trajectories 
initialized just outside the cycle head out to an exterior
attractor while those initialized inside head towards an
interior attractor. Once again, the splitting involves infinite
time associated with the slow movement away from the cycle.

With  such  an  unstable  cycle  it  is  easy  to  devise 
a  situation  in  which  the  interior  motions  tend, 
not  to  a  single  attractor,  but  to  two  or  more 
distinct  attractors.  This  is  illustrated  in  fig.  4a, 
which  has  just  two  interior  attracting  points,  and 
one  exterior  attractor,  a  stable  limit  cycle.  The 
basins  of  the  two  interior  attractors  wind  out¬ 
wards  and  spiral  onto  the  unstable  limit  cycle  in 
what  we  might  call  a  mosquito-coil  structure  (not 
a  fractal  structure):  trajectories  initialized  just 
inside  the  cycle  will  be  attracted  to  one  or  other 
of  the  fixed  points  depending  sensitively  on  their 
starting displacement and phase. So we see that a smooth basin
boundary can give dispersal to an arbitrary number of
alternative attractors (an example involving six exterior point
attractors will be seen later in fig. 5).

Fig. 4. An indeterminate cyclic fold before (a) and after (b)
the coalescence of the two limit cycles. Window is
-0.15 < x < 0.15, -0.01 < y < 0.01.

To generate a significant level of global unpredictability, we
require that a lot of trajectories be involved, and it is here
that the smooth boundary does not score highly. To get the
dispersal we are obliged to place starts close to the basin
boundary: the relevant starts in $R^2$ must be clustered around
a line, rather than filling an area. So not many trajectories
are captured by the dispersing action, and we assign the smooth
basin boundary the two points shown in the capture-dispersal
diagram of fig. 1.


Fig. 5. Multiple dispersal from an indeterminate cyclic fold.
Front portrait, $\mu = -0.01$; centre portrait, $\mu = 0$; rear
portrait, $\mu = 0.02$. Portrait window, $-3.5 < x, y < 3.5$. In
the front portrait the dotted band of width $2(-\mu)^{1/2}$
separates the outer unstable limit cycle (shown as a broken
circle) from the inner stable cycle. In the centre portrait, the
critical, structurally unstable cycle at r = 1 is indicated by
double arrows.

3.3. The fractal basin boundary

The  mosquito-coiled  basins  of  fig.  4a  give  a 
high  degree  of  indeterminacy  to  trajectories  fal¬ 
ling  inwards  off  the  unstable  cycle,  because  as 
they  approach  the  cycle  the  alternating  spiral 
tails  of  the  basins  have  vanishing  thickness. 

Fractal  basin  boundaries  [6],  generated  for 
example  in  driven  oscillators  by  homoclinic 
tangling  of  the  invariant  manifolds  of  a  directly 
unstable  saddle  cycle  [7],  achieve  such  a  delicate 
structure  in  a  more  extended  region  of  phase 
space.  It  thus  seems  appropriate  to  assign  to 
them  a  greater  degree  of  capture  in  fig.  1.  How¬ 
ever,  since  their  fractional  dimension  is  less  than 
that  of  the  phase  space,  we  are  still  frustrated  by 
the  fact  that  the  probability  of  a  start  falling 
precisely  on  a  fractal  basin  boundary  is  zero.  As 
with  smooth  boundaries,  fractal  boundaries  can 
involve  the  thin  intertwining  tails  of  two  or  more 
basins,  as  recognised  by  their  placing  in  the 
figure. 

3.4.  Attractor  paths  and  determinate 
bifurcations 

We introduce at this stage the concept of an evolving system in
which a control parameter undergoes a slow sweep. During most of
our discussion we can imagine a parameter $\mu$ to be either
repeatedly stepped by small increments $\Delta\mu$, or to be a
slow function of time. In the latter case, a parametrized flow
on $R^n$, $\dot{x}_i = f_i(x_j, \mu)$, would be replaced by a
flow on $R^{n+1}$, by adding the equation $\dot{\mu} = \epsilon$,
and we notice that at any $\mu$ the projection of the $R^{n+1}$
vector field is identical to the $R^n$ field.
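
The lifting of a parametrized flow to phase-control space can be sketched in a few lines. This is a minimal illustration under stated assumptions: the helper name `augment`, the example field `fold` (the canonical fold $\dot{x} = \mu + x^2$) and the sweep rate are all hypothetical choices made for the sketch.

```python
def augment(f, eps):
    """Lift x' = f(x, mu) on R^n to a flow on R^(n+1) by appending mu' = eps."""
    def g(state):
        *x, mu = state                 # last component is the control parameter
        return list(f(x, mu)) + [eps]  # original field, plus the slow sweep
    return g


def fold(x, mu):
    """Illustrative one-dimensional field (assumed here): x' = mu + x^2."""
    return [mu + x[0] ** 2]


swept = augment(fold, eps=1e-3)
state = [0.5, -0.1]                    # (x, mu)
print(swept(state))                    # field in phase-control space
print(fold(state[:-1], state[-1]))     # its projection is the original field
```

At every value of the parameter the first n components of the lifted field coincide with the frozen-parameter field, which is the projection property noted above.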

In phase-control space we now have attractor paths with
transients settling quickly towards a path. Such a path might be
the continuous trace of a point attractor, of a periodic or
quasi-periodic attractor, or of a chaotic attractor; and we
assign to these the increasing dispersal measures that we used
before. Since the capture is now from a basin in phase-control
space, we give them all (somewhat arbitrarily) a higher capture
measure.

The  codimension-one  attractor  bifurcations 
that  will  be  typically  encountered  on  such  a 
parameter  sweep  can  be  classified  as  in  table  1, 
following  [8-11].  The  subtle  bifurcations  are 
local  events  that  signal  a  continuous  supercritical 
growth  of  a  new  attractor  path.  They  are  safe,  in 
that  they  do  not  trigger  a  fast  dynamic  jump,  or 
even  an  instantaneous  enlargement  of  the  at¬ 
tracting  set.  They  are,  moreover,  determinate  in 
outcome,  the  new  path  being  uniquely  defined 
even  in  the  presence  of  external  noise. 

So  a  subtle  bifurcation  achieves  no  dispersal  in 
its  own  right.  It  generates,  at  the  most,  a  de¬ 
terminate  change  in  attractor  type,  correspond¬ 
ing  to  a  movement  between  two  of  the  diamond 
symbols  on  fig.  1.  In  terms  of  dispersal,  it  is  not  a 
significant event on an attractor path. The same conclusion
holds for the catastrophic-explosive bifurcations of table 1b.
These are global events, characterized by a sudden,
instantaneous enlargement of the attracting set, with no jump to
a remote disconnected attractor, and no indeterminacy in
outcome.

3.5.  Indeterminate  regular  bifurcations 

We arrive, finally, at the catastrophic-dangerous bifurcations
of table 1c, which are characterized by the blue-sky
disappearance of an attractor, giving an inevitable jump to a
remote attractor of any type. These are dangerous in that a
system will experience a sudden fast dynamic jump to a distant
attractor. With the exception of the saddle-node fold and the
saddle connection, whose 1D outsets (centre-unstable manifolds)
ensure, generically, a unique outcome of the dynamic jump, these
dangerous bifurcations can all exhibit indeterminate outcomes.
Such indeterminacy does not seem to be well documented, and we
give first two examples from the regular local bifurcations.

Table 1

Classification of the generic codimension-one attractor bifurcations in dissipative dynamics. In this we speak of a forward control
sweep as one which generates instability or increased complexity. The | identifies equivalent flow/mapping forms.

(a) Subtle (i.e. continuous) bifurcations

    Local bifurcations with the continuous supercritical growth of a new attractor path
    Safe with no fast dynamic jump or instantaneous enlargement of the attracting set
    Determinate with a single outcome even under small noise excitation
    No hysteresis with attractor paths retraced on reversal of control sweep
    No basin change, with basin boundary remote from the bifurcating attractors
    No intermittency in the steady-state responses of the attractors

    Supercritical Hopf (point to cycle)
        Other names: in aeroelasticity, galloping or flutter
        Local bifurcation with imaginary conjugate pair of flow eigenvalues $\lambda$

    Supercritical Neimark (cycle to torus)
        Other names: supercritical secondary Hopf bifurcation
        Complex pair of mapping eigenvalues with $|\lambda| = 1$. Note resonances if $\lambda^3$ or $\lambda^4 = 1$

    Supercritical flip (cycle to cycle)
        Other names: supercritical period-doubling bifurcation, subharmonic resonance
        Local bifurcation with one mapping eigenvalue, $\lambda = -1$

(b) Catastrophic (i.e. discontinuous) explosive bifurcations

    Global bifurcations with a sudden, instantaneous enlargement of the attracting set
    Explosive enlargement, but no jump to remote disconnected attractor
    Determinate with a single outcome even under small noise excitation
    No hysteresis with attractor paths retraced on reversal of control sweep
    No basin change, with basin boundary remote from the bifurcating attractors
    Intermittency: supercritical lingering in old domain, flashes through new domain

    Flow explosion (point to cycle)
        Other names: omega explosion
        Locally a saddle-node whose 1D flow outset forms a closed loop

    Map explosion (cycle to torus)
        Other names: mode locking/unlocking
        Locally a saddle-node whose 1D mapping outset forms a closed loop or drift ring

    Intermittency explosions (cycle to chaos)
        Other names: intermittency, interior crisis, interior catastrophe
        Locally a fold, subcrit. flip or subcrit. Neimark; e.g. is opening of periodic window

    Chaotic explosion (chaos to chaos)
        Other names: interior crisis, interior catastrophe
        Example is the closing of a periodic window in the logistic map

(c) Catastrophic (i.e. discontinuous) dangerous bifurcations

    Blue-sky disappearance of attractor giving jump to a remote attractor of any type
    Dangerous with a sudden fast dynamic jump to a distant unrelated attractor
    Determinate or indeterminate in outcome, depending on global topology
    Hysteresis with original attractor not reinstated on reversal of control sweep
    Basin shrinks to zero (c.2) or attractor hits boundary of a residual basin (c.1) & (c.3)
    No intermittency (except for a subcritical form in the saddle connection)

(c.1) The local saddle-nodes, with (in)finite residual basins

    Saddle-node fold (from point)
        Other names: in elasticity, limit point. Pitchfork (cusp) & transcritical not directly generic
        Local saddle-node bifurcation of point attractors with one flow eigenvalue $\lambda = 0$
        Indeterminate: can be devised by simultaneous saddle connection, but not generic

    Cyclic fold (from cycle)
        Other names: saddle-node, dynamic fold, periodic fold (jump to resonance)
        Local saddle-node bifurcation of periodic attractors with one mapping eigenvalue $\lambda = +1$
        Indeterminate: mosquito-coil, this paper, figs. 4-6. Tangled s/n, figs. 9, 10

(c.2) The local subcritical bifurcations, with basins shrinking to zero

    Subcritical Hopf (from point)
        Other names: in aeroelasticity, galloping or flutter
        Local bifurcation with an imaginary conjugate pair of flow eigenvalues $\lambda$
        Indeterminate: aeroelastic galloping in asymmetric well, this paper, figs. 7, 8

    Subcritical Neimark (from cycle)
        Other names: subcritical secondary Hopf bifurcation
        Local bifurcation with complex pair of mapping eigenvalues with $|\lambda| = 1$
        Indeterminate: no example known to author

    Subcritical flip (from cycle)
        Other names: subcritical period-doubling bifurcation, subharmonic resonance
        Local bifurcation with one mapping eigenvalue, $\lambda = -1$
        Indeterminate: no example known to author

(c.3) Global bifurcations, with (in)finite residual basins

    Saddle connection (from cycle)
        Other names: homoclinic connection, homoclinic bifurcation
        Global homoclinic connection: no mapping form because connection is replaced by tangle
        Indeterminate: like the fold, the 1D outflow only allows non-generic indeterminacy

    Regular saddle catastrophe (from chaos)
        Other names: boundary crisis, chaotic blue sky catastrophe
        Chaotic attractor hits saddle on smooth boundary which simultaneously becomes homoclinic
        Indeterminate: no example known to author

    Chaotic saddle catastrophe (from chaos)
        Other names: boundary crisis, blue sky catastrophe
        Chaotic attractor hits accessible orbit in a previously tangled basin boundary
        Indeterminate: see ref. [1]

We have seen in fig. 4a how the basins of two (or more)
attractors can accumulate onto an unstable limit cycle in a
mosquito coil. The unstable cycle can be brought into collision
with a surrounding stable cycle, by, for example, sweeping
parameter B in the equation

$\ddot{x} - B\dot{x} + x^2\dot{x} - Ax + x^3 = 0$,

with A constant at 0.01. This is illustrated in fig. 4, the two
cycles of fig. 4a at B = 0.0076 having disappeared from the
portrait of fig. 4b at B = 0.0075, due to their mutual
annihilation at an intervening cyclic fold. The outcome of this
local bifurcation is clearly indeterminate, as described by
Abraham [10] and Stewart and Ueda [1] based on sections 1.9 and
7.3 of ref. [12]. The system will settle, after the fold, onto
one of the two competing point attractors: the one chosen
depending sensitively on the starting conditions at the
beginning of the parameter sweep, the rate of the sweep, and any
real or numerical noise present in the experiment.
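
The fixed points involved in this scenario can be checked with a short calculation. The sketch below assumes the oscillator reads $\ddot{x} - B\dot{x} + x^2\dot{x} - Ax + x^3 = 0$ and uses the quoted values A = 0.01, B = 0.0076; the function names and the classification labels are choices made here for illustration only.

```python
import numpy as np

A, B = 0.01, 0.0076   # values quoted in the text for fig. 4a


def field(state, A=A, B=B):
    """Assumed oscillator as a first-order system with y = dx/dt."""
    x, y = state
    return np.array([y, B * y - x**2 * y + A * x - x**3])


def jacobian(x_eq, A=A, B=B):
    """Jacobian of the field at an equilibrium (x_eq, 0)."""
    return np.array([[0.0, 1.0],
                     [A - 3.0 * x_eq**2, B - x_eq**2]])


def classify(x_eq):
    """Label an equilibrium from the eigenvalues of its linearization."""
    eig = np.linalg.eigvals(jacobian(x_eq))
    if eig.real.min() < 0.0 < eig.real.max():
        return "saddle"
    return "point attractor" if eig.real.max() < 0.0 else "repellor"


for x_eq in (0.0, np.sqrt(A), -np.sqrt(A)):
    print(f"x = {x_eq:+.2f}: {classify(x_eq)}")
```

Under these assumptions the origin is a saddle and the two equilibria at $x = \pm A^{1/2}$ are the competing point attractors, which remain stable as long as B < A. The limit cycles themselves are not located by this linear calculation.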


It is easy to devise a cyclic fold that has any desired number
of indeterminate outcomes. Consider for example the system
described in polar coordinates by

$\dot{r} = -[(r - 1)^2 + \mu](r - 3)$,

$\dot{\theta} = 3 - r + \sin(m\theta)$.

As $\mu$ changes from negative to positive, two cycles of radii
$1 \pm (-\mu)^{1/2}$ collide at r = 1, giving an indeterminate
jump from the inner stable cycle to one of the m point
attractors that are equally spaced around the circle r = 3.
Notice that in devising this scenario we have chosen the form of
$\dot{\theta}$ to ensure that $\dot{\theta}$ is always positive
if r < 2, guaranteeing cycles whenever $\dot{r} = 0$; and to
ensure that $\dot{\theta}$ is purely sinusoidal when $\dot{r}$
vanishes at r = 3, to give a symmetric pattern of alternating
point attractors and saddles.
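
A numerical sketch of this system helps to see which attractor a given start reaches. It assumes the polar equations $\dot{r} = -[(r-1)^2 + \mu](r-3)$ and $\dot{\theta} = 3 - r + \sin(m\theta)$; the fixed-step RK4 integrator, the step size, the run length and the attractor labelling are all illustrative choices, not part of the original study.

```python
import math


def field(r, theta, mu, m):
    """Polar vector field of the assumed cyclic-fold example."""
    r_dot = -((r - 1.0) ** 2 + mu) * (r - 3.0)
    theta_dot = 3.0 - r + math.sin(m * theta)
    return r_dot, theta_dot


def settle(r0, theta0, mu, m=6, dt=0.01, t_end=200.0):
    """Integrate with classical RK4 and return the final (r, theta)."""
    r, th = r0, theta0
    for _ in range(int(t_end / dt)):
        k1 = field(r, th, mu, m)
        k2 = field(r + 0.5 * dt * k1[0], th + 0.5 * dt * k1[1], mu, m)
        k3 = field(r + 0.5 * dt * k2[0], th + 0.5 * dt * k2[1], mu, m)
        k4 = field(r + dt * k3[0], th + dt * k3[1], mu, m)
        r += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        th += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return r, th


def attractor_index(theta, m=6):
    """Label (0..m-1) of the point attractor at theta = (2k+1) pi / m."""
    return int(round((m * theta / math.pi - 1.0) / 2.0)) % m


# Same start, slightly different (frozen) values of mu just after the fold.
for mu in (0.02, 0.018, 0.016):
    r, th = settle(0.9, 0.0, mu)
    print(f"mu = {mu}: r = {r:.4f}, attractor {attractor_index(th)}")
```

On the circle r = 3 the angular equation reduces to $\dot{\theta} = \sin(m\theta)$, so the stable points sit where $\sin(m\theta) = 0$ with $\cos(m\theta) < 0$; the label returned depends on how long the trajectory lingers near r = 1.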

This scenario is illustrated in fig. 5 for the case of m = 6.
The last portrait, just after the collision of the two cycles,
shows how the basins of the six attracting fixed points now pass
through the region of slow dynamics at r = 1 where $\dot{r}$,
although now always positive, is still small.

Fig. 6. Multiple dispersal from an indeterminate cyclic fold,
showing the behaviour of one of the basins of attraction at
$\mu > 0$. The central disc pattern rotates with infinite
velocity as $\mu \to 0$.

It is of interest to examine, heuristically, the rate of
rotation of the central disc-like basin pattern as $\mu \to 0$.
The inset of saddle D in fig. 6 leaves the central disc at A.
The initial flow is insensitive to the value of $\mu \approx 0$,
so we can suppose that the inset enters the region of slow
dynamics at B with $\theta_B - \theta_A$ equal to a constant
$C_1$. Similarly, between leaving the slow region at C and
arriving near D where $\theta = 0$, we can write
$\theta_D - \theta_C = C_2$. It is the long time of passage
through the slow region around r = 1 that depends sensitively on
$\mu$. For the canonical fold, $\dot{x} = \mu + x^2$, we can
write the solution in terms of $a = \mu^{1/2}$, valid for
positive $\mu$, as $x = a\tan(at)$ with $x \to \infty$ at
$t = \pi/2a$. The time of passage from $x = -\infty$ to
$x = +\infty$ is therefore $\pi/a$, scaling as $\mu^{-1/2}$. So
for our cyclic fold we can write $t_C - t_B \propto \mu^{-1/2}$
and, assuming that $\dot{\theta}$ is approximately constant in
the transit from B to C, $\theta_C - \theta_B = C_3\mu^{-1/2}$
and hence $\theta_A = -C_1 - C_2 - C_3\mu^{-1/2}$, giving finally

$d\theta_A/d\mu = (\tfrac{1}{2}C_3)\,\mu^{-3/2}$.

So the rate of rotation of the central disc scales as
$\mu^{-3/2}$. As we approach the cyclic fold from positive $\mu$
the disc retains its form but acquires an infinite rate of
rotation until it disappears altogether at $\mu = 0$, since the
basins do not penetrate to the interior for $\mu < 0$.
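
The passage-time scaling used in this argument can be checked numerically. The sketch below assumes the canonical fold $\dot{x} = \mu + x^2$ and a finite window $|x| \le L$; the window size, the midpoint quadrature and the function names are illustrative choices.

```python
import math


def passage_time(mu, L=10.0, n=100_000):
    """Time for x' = mu + x**2 to carry x from -L to +L.

    Uses the midpoint rule on dt = dx / (mu + x**2).
    """
    h = 2.0 * L / n
    return sum(h / (mu + (-L + (k + 0.5) * h) ** 2) for k in range(n))


def passage_time_exact(mu, L=10.0):
    """Closed form (2/a) atan(L/a) with a = sqrt(mu); tends to pi/a as L grows."""
    a = math.sqrt(mu)
    return (2.0 / a) * math.atan(L / a)


for mu in (0.04, 0.01, 0.0025):
    print(mu, passage_time(mu), passage_time_exact(mu), math.pi / math.sqrt(mu))
```

Dividing $\mu$ by four roughly doubles the passage time, as expected from the $\mu^{-1/2}$ law; the small difference from $\pi/a$ comes from the finite window.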

At any fixed $\mu > 0$ the central pattern retains its hexagonal
symmetry, so there is no great sensitivity to the starting
condition $[r(0), \theta(0)]$. For arbitrarily small $\mu > 0$
we can find a finite circle in phase space whose interior points
all lie in a prescribed basin. But near the bifurcation the
unbounded rate of rotation implies that the point attractor to
which a motion flows is infinitely sensitive to the value of
$\mu$. As $\mu$ decreases towards zero, the radius of the
largest sphere in phase-control space whose interior points all
lie in one basin will vanish as $\mu^{3/2}$. Notice, however,
that this infinite sensitivity to $\mu$ is not reflected in the
response of the evolving system, $\dot{\mu} = \epsilon$, since
the attractor to which a non-evolving trajectory would tend as
$t \to \infty$ has no significance under evolving conditions.
The characteristics of the local 2D and 3D flows are in fact
very simple, and, as we have remarked, they are in no way
governed by the invariant manifolds of the distant saddles.

As  a  second  illustration  of  a  regular  indetermi¬ 
nate  bifurcation,  consider  the  oscillator 

$\ddot{x} + B\dot{x} - C\dot{x}^3 + x + (D - 1)x^2 - Dx^3 = 0$,

which arises in problems of aero-elastic galloping. As the
wind-loading parameter B passes from positive to negative the
equilibrium at x = 0 loses its stability at a sub-critical Hopf
bifurcation beyond which the divergence is everywhere positive
and all trajectories diverge to $x = \pm\infty$. The nonlinear
stiffness corresponds to a metastable well similar to that in
fig. 7, parameter D allowing us to vary the relative heights of
the two potential barriers. This could correspond to a nominally
symmetric structural shell, buckling in the presence of a
symmetry-breaking imperfection. Taking the left-hand barrier to
be the higher, as drawn, it is clear that if the heights are
very different the bifurcation will be determinate, all motions
from the bottom of the well tending to $x = +\infty$.
Conversely, if the heights are nearly the same, the bifurcation
will be indeterminate, with adjacent motions, differing perhaps
only in their starting phase, heading to either of
$x = \pm\infty$. The critical condition separating these two
types of Hopf bifurcation is represented by the saddle
connection illustrated, in which a trajectory climbing out of
the well under the positive divergence can just pass from the
lower hill-top to the higher hill-top. Phase portraits
corresponding to the indeterminate form of Hopf bifurcation are
shown in fig. 8.

Fig. 7. Sub-critical Hopf bifurcations in an asymmetric
potential well. In parameter space the determinate bifurcations,
giving an inevitable jump to the right, are separated from
indeterminate bifurcations, giving jumps to either left or
right, by the saddle connection illustrated.

Fig. 8. Indeterminate sub-critical Hopf bifurcation, such as
might arise due to aero-elastic galloping in an asymmetric
potential well. Phase portraits show the basin structure before
(top) and after (bottom) the bifurcation.
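
The role of D in setting the two barrier heights can be illustrated with a short calculation. The sketch assumes that the stiffness terms of the oscillator are $x + (D - 1)x^2 - Dx^3 = x(1 - x)(1 + Dx)$ with D > 0, so that the well sits at x = 0 and the hill-tops at x = 1 and x = -1/D; the potential V and the function names are introduced here for illustration.

```python
def potential(x, D):
    """V(x) with V'(x) = x + (D - 1) x^2 - D x^3 (assumed stiffness)."""
    return 0.5 * x**2 + (D - 1.0) * x**3 / 3.0 - D * x**4 / 4.0


def barriers(D):
    """Heights of the left (x = -1/D) and right (x = 1) hill-tops
    above the bottom of the well at x = 0, for D > 0."""
    return potential(-1.0 / D, D), potential(1.0, D)


for D in (1.0, 0.9, 0.8):
    left, right = barriers(D)
    print(f"D = {D}: left barrier {left:.4f}, right barrier {right:.4f}")
```

With D = 1 the well is symmetric; reducing D below 1 raises the left-hand barrier relative to the right-hand one, which is the asymmetry drawn in fig. 7. Where the determinate and indeterminate regimes separate depends on the damping terms as well, and is not captured by the potential alone.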

3.6.  Indeterminate  chaotic  bifurcations 

The  cyclic  fold  of  fig.  5  is  a  mechanism  for 
sweeping  all  trajectories  of  a  phase-control  basin 
precisely  onto  a  smooth  basin  boundary  from 
which  they  can  be  dispersed  to  a  large  number  of 
competing  attractors.  It  is  an  excellent  generator 
of  global  unpredictability.  In  a  driven  system,  a 
cyclic  fold  can  sweep  trajectories  onto  a  fractal 
boundary,  scoring  equally  well  on  capture- 
dispersal,  and  offering  additional  uncertainties 
due  to  the  mixing  action  of  the  associated  chaotic 
transients. 

An  example  of  this  has  been  identified  in  the 
resonance  of  a  softening  system  [2,  3],  using  the 
canonical  escape  equation  of  our  earlier  studies 
[7,  13,  14], 

$\ddot{x} + \beta\dot{x} + x - x^2 = F\sin(\omega t)$.

Typical resonance response diagrams are shown in fig. 9. The top
picture shows a safe jump from the saddle-node A which is
determinate, the jump to resonance always restabilizing at R.
The third picture also shows a determinate jump from A,
corresponding now to a direct escape from the cubic well.
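
A direct integration of the escape equation gives a feel for the difference between bounded response and escape. This is a rough sketch only: it assumes the equation $\ddot{x} + \beta\dot{x} + x - x^2 = F\sin(\omega t)$, starts from rest at fixed frequency (no frequency sweep, so it does not reproduce the jumps of fig. 9), and uses an arbitrary escape threshold; all function and parameter names are hypothetical.

```python
import math


def escape_time(F, omega=0.83, beta=0.1, dt=0.01, t_end=300.0, x_escape=10.0):
    """Integrate the assumed escape equation from rest with RK4.

    Returns the first time x exceeds x_escape, or None if the motion
    stays inside the well up to t_end.
    """
    def acc(t, x, v):
        return F * math.sin(omega * t) - beta * v - x + x * x

    t, x, v = 0.0, 0.0, 0.0
    while t < t_end:
        k1x, k1v = v, acc(t, x, v)
        k2x = v + 0.5 * dt * k1v
        k2v = acc(t + 0.5 * dt, x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x = v + 0.5 * dt * k2v
        k3v = acc(t + 0.5 * dt, x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x = v + dt * k3v
        k4v = acc(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        t += dt
        if x > x_escape:          # past the hill-top at x = 1 and running away
            return t
    return None


for F in (0.01, 0.05, 5.0):
    print(F, escape_time(F))
```

Very weak forcing leaves the motion near the bottom of the well, while very strong forcing drives it over the hill-top at x = 1 within the first forcing cycle. The intermediate range, where the outcome depends delicately on F, phase and frequency, is the regime discussed in the text.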

Of primary interest is the centre picture, in which saddle-node
A is indeterminate, with jumps stabilizing either on the large
amplitude harmonic oscillation R; on a coexisting subharmonic,
Q, of order n = 3; or on the attractor at infinity, P, having
escaped out of the well. This indeterminacy arises because at
these parameter values we have a tangled saddle-node [2, 3],
with A located on a fractal basin boundary as illustrated in
fig. 10: here the saddle and node are adjacent black-on-white
circles near the edge of the grey residual basin. This grey
basin will be instantaneously striated by the basins of P, Q and
R, in a manner similar to that of fig. 6, once the saddle and
node are annihilated at the imminent cyclic fold [15].

Fig. 9. Three resonance response curves for the canonical escape
equation at $\beta = 0.1$, showing the variation of the maximum
displacement, $x_m$, with the forcing frequency, $\omega$. The
top picture, at F = 0.0562, shows a safe jump to resonance from
the saddle-node, A, that always restabilizes on the harmonic
attractor, R. The middle picture, at F = 0.08, shows an
indeterminate jump that may or may not restabilize on R. The
bottom picture, at F = 0.12, shows a determinate jump to
infinity, out of the potential well: this occurs once the
chaotic bifurcation terminating the cascade from C has passed to
the right of A.

Fig. 10. Invariant manifolds (top) and basins of attraction
(bottom) just prior to the tangled saddle-node, A, of the centre
picture of fig. 9. Parameter values are $\beta = 0.1$,
$\omega = 0.83$, F = 0.0795, with the Poincaré section defined
at phase $\phi = 180°$. White represents the basin of the
attractor at infinity, black the basin of the resonant harmonic
oscillation, R, and grey the (residual) basin of the
destabilizing node.

This  tangled  saddle-node,  an  indeterminate 
bifurcation  associated  with  a  fractal  boundary, 
gives  us  perhaps  the  highest  possible  degree  of 
global  unpredictability,  represented  by  the  upper 
right-hand  point  of  fig.  1.  Similar  indeterminate 
bifurcations  govern  the  escape  from  single-well 
to  cross-well  motions  in  the  twin-well  Duffing 
oscillator  [1,  16,  17];  and  for  most  softening 
oscillators  there  is  a  boundary  crisis  [18]  near  the 
top  of  the  resonance  response  diagram  which  can 
also  be  indeterminate  [1]. 


4.  Concluding  remarks 

We  have  highlighted  some  of  the  phenomena 
that  can  give  high  degrees  of  global  unpredic¬ 
tability  in  nonlinear  dissipative  dynamics.  In¬ 
determinate  bifurcations  are  particularly  severe 
events  in  terms  of  our  qualitative  capture- 
dispersal  criterion.  So  precursor  techniques  that 
spot  the  imminent  approach  of  a  bifurcation  in  a 
slowly  evolving  system  will  be  particularly  useful 
[19]:  and  chaotic  motions  are  helpful  here  be¬ 
cause  their  wandering  trajectories  supply  a  lot  of 
information  about  the  underlying  dynamics  [20, 
21].  Of  course,  none  of  the  phenomena  violate 
the  continuity  of  response  against  starting  and 
control  perturbations,  outlined  in  section  2.  This 
is  true  both  for  the  evolving  and  the  nonevolving 
systems,  but  the  former  can  give  severe  practical 
unpredictability  if  they  involve  a  slow  passage 
through an indeterminate bifurcation.

We demonstrate how some of the common methods in nonlinear data analysis can be modified to account for
nonuniform time sampling by using fuzzy delay coordinate reconstructions. The effects of measurement noise and time
sampling errors are compared. We have also developed a learning algorithm to find optimal fuzzy delay coordinate
reconstructions of the data to suit one's goals. This can be employed to find representations optimized for measuring
topological invariants, forecasting, control, or other goals.


1.  Introduction 

The  process  of  observation  and  analysis  is 
fundamental  to  science,  but  has  many  associated 
limitations.  The  most  commonly  discussed  prob¬ 
lems  are  dynamical  noise,  measurement  noise, 
and  small  data  sets.  We  will  address  some  of  the 
unique  aspects  of  data  sets  which  have  an  addi¬ 
tional  problem:  they  are  not  sampled  uniformly 
in  time  or  have  an  uncertainty  in  the  observation 
time.  In  recent  years,  much  work  has  been  done 
to  develop  tools  for  analyzing  nonlinear  systems, 
but  these  techniques  have  not  yet  considered 
nonuniform  time  sampling.  The  development  of 
fuzzy  delay  coordinate  reconstructions  is  the  pri¬ 
mary  result  of  this  work.  This  arises  as  a  natural 
generalization  of  delay  coordinate  state  space 
reconstruction  from  a  time  series.  We  illustrate 
the  implementation  of  fuzzy  delays  with  mutual 
information  and  dimension  calculations,  and  de¬ 
scribe  an  alternate  approach  to  obtaining  optimal 
representations  and  forecasting  which  employs  a 
learning  algorithm. 

'  E-mail  address:  [breeden,n|@complex. ccsr.uiuc.edu 


The  problem  of  nonuniformly  sampled  data  is 
often  ignored  because  it  is  not  as  prevalent  in 
laboratory  experiments  as  measurement  noise. 
In  a  controlled  environment,  slight  uncertainties 
will  always  exist  as  to  when  a  measurement  was 
made,  but  these  errors  are  commonly  negligible. 
Not  all  scientific  endeavors  are  so  fortunate. 
Astronomers,  for  instance,  must  always  contend 
with  fluctuating  weather  conditions  and  competi¬ 
tion  for  equipment  use.  Because  of  this,  obtain¬ 
ing  long  continuous  observations  of  a  single  ob¬ 
ject  can  be  extremely  difficult.  Researchers  in 
many  fields  are  often  under  constraints  which 
make  uniform  sampling  difficult.  We  undertake 
the  current  investigation  to  accommodate  the 
analysis  of  such  data. 

To  develop  techniques  for  nonuniformly  sam¬ 
pled  data,  we  must  distinguish  between  discrete 
time  systems  (dynamics  which  could  be  described 
by  iterative  mappings)  and  continuous  time  sys¬ 
tems  (dynamical  flows  which  could  be  described 
by  differential  equations).  Data  from  either  type 
of  system  might  require  the  techniques  to  be 
presented  here. 

If a discretely timed system is sampled randomly, the result is
a nonuniformly sampled time series of the kind we are
considering. This could arise, for example, in observations of a
pulsar under adverse conditions such that not all the emissions
could be observed. Because of the potential for chaos in
iterative systems, we cannot expect to be able to interpolate
between observations. Numerically, we will study this with
iterated maps where not every iteration is recorded.
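
A test series of this kind can be generated in a few lines. The sketch below is an assumption-laden illustration: the logistic map, its parameter, the retention probability and the seed are choices made here, not the maps or settings used in the original study.

```python
import random


def sparse_logistic(n_steps, keep_prob, r=4.0, x0=0.3, seed=0):
    """Iterate the logistic map, recording each iterate with probability keep_prob.

    Returns a list of (iteration index, value) pairs, so the time of
    every retained observation is known exactly.
    """
    rng = random.Random(seed)
    x = x0
    record = []
    for n in range(n_steps):
        if rng.random() < keep_prob:
            record.append((n, x))
        x = r * x * (1.0 - x)
    return record


series = sparse_logistic(1000, keep_prob=0.5)
gaps = [b[0] - a[0] for a, b in zip(series, series[1:])]
print(len(series), "observations; largest gap:", max(gaps), "iterations")
```

The irregular gaps between retained iterates play the role of the nonuniform sampling; in a chaotic regime the skipped values cannot be recovered by interpolation.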

Continuous time systems present a greater range of scenarios.
The simplest situation is to have an uncertainty associated with
the time of a measurement, $x(t_i \pm \delta t_i)$, similar to
measurement noise, $x(t_i) \pm \delta x(t_i)$. We can relate
time sampling errors to measurement errors if we assume the
dynamics maintain a flow vector field v(x) with probability
density p(x). By a Taylor series expansion to first order,

$x_j(t_i \pm \delta t_i) = x_j(t_i) \pm \dot{x}_j\,\delta t_i = x_j(t_i) \pm \delta x_j(t_i), \quad \forall j \in [1, D]$,   (1)

where $\delta x_j(t_i) = v_j(x(t_i))\,\delta t_i$. Furthermore,
if the distribution of time sampling errors is given by
$g(\delta t)$ and the system has an ergodic invariant
probability distribution over the state space p(x), we can
relate this to a measurement noise distribution [1] of


$\iint \frac{p(x)\,g(\delta t)}{|(\partial v_j/\partial x_j)\,\delta t|}\,\delta(\delta x_j - v_j(x)\,\delta t)\,dx\,d(\delta t)$.   (2)


Of course, this assumes $\delta t$ is small relative to the
typical divergence time of nearby trajectories as given by the
Lyapunov exponents [2].
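
The first-order relation between a timing error and an equivalent measurement error can be checked on a signal with a known derivative. The sketch assumes the relation $\delta x = v(x(t))\,\delta t$; the test signal x(t) = sin t and the function name are illustrative choices.

```python
import math


def equivalent_noise(velocity, dt_err):
    """First-order estimate of the measurement error caused by a timing error."""
    return velocity * dt_err


t, dt_err = 1.0, 1e-3
actual = math.sin(t + dt_err) - math.sin(t)        # true shift in the observable
estimate = equivalent_noise(math.cos(t), dt_err)   # v = dx/dt = cos t
print(actual, estimate, abs(actual - estimate))
```

The discrepancy is second order in the timing error, so the approximation degrades as the timing error grows or where the flow is fast.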

We also consider measurements which are taken nonuniformly in
time, $t_i \neq i\,\Delta t$, with negligible uncertainty,
$\delta t = 0$. This is similar to the problem for discretely
timed systems, and is the most interesting application of fuzzy
delay coordinates.

In the following sections, we discuss how nonuniformly sampled
data and fuzzy reconstructions (section 1) may be implemented in
calculations of mutual information for finding good coordinates
(section 2), dimension for determining the number of coordinates
for a deterministic reconstruction (section 3), optimal
representations based upon the experimenter's goals for analysis
(section 4), and nonlinear modelling and forecasting (section
5). Finally, we give an example for the analysis of optical
emissions of quasar B2 1308+326 (section 6).


2.  Fuzzy  delay  coordinates 

One  of  the  primary  tools  in  nonlinear  analysis 
is  the  reconstruction  of  the  state  space  from  a 
time  series.  To  acquire  some  basic  information 
about a system’s dynamics, one can plot the available time series, $x(t)$, versus derivatives ($\dot{x}(t)$, $\ddot{x}(t)$, ...), delay coordinates [3] ($x(t - \tau_1)$, $x(t - \tau_2)$, ...), or other reconstructed coordinates [4, 5]. We are using the state space reconstructions to find structure within the time series where none may be apparent, as in chaotic systems.

If  the  system  is  well  sampled,  i.e.,  the  ex¬ 
perimenter  knows  that  the  state  of  the  system 
does  not  change  significantly  between  observa¬ 
tions,  interpolation  can  be  used  to  generate  a 
data  set  which  is  uniformly  spaced  in  time.  How¬ 
ever,  for  sparsely  sampled  or  very  noisy  systems, 
interpolation  is  as  likely  to  degrade  as  improve 
the  data.  In  these  situations,  derivatives  and 
most  other  reconstructed  coordinates  also  add 
noise. We introduce fuzzy delay coordinate reconstructions, fig. 1, a modification of delay coordinates, so as to use the data unaltered. The state space will be generated using ($x(t)$, $x(t - \tau_1')$, $x(t - \tau_2')$, ...) where $\tau_i' \in [\tau_i - \delta\tau_i, \tau_i + \delta\tau_i]$.

Rather than defining a coordinate by a single delay $\tau_i$, we are allowing for a tolerance window of width $2\,\delta\tau_i$ about $\tau_i$. The point closest to the center of this window will be accepted for the reconstruction. The problem of which coordinates to use and how many is a major topic of this paper.

Fig. 1. The generation of fuzzy delay coordinates. $\tau$ is the commonly defined delay and $\delta\tau$ is the tolerance window about that delay for accepting sets of points into the state space reconstruction. The point closest to $\tau$ within the range $[\tau - \delta\tau, \tau + \delta\tau]$ is accepted.
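
The window rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `fuzzy_delay_vectors`, the argument layout, and the choice to skip reference samples whose window is empty are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def fuzzy_delay_vectors(times, values, delays, widths):
    """Build fuzzy delay vectors (x(t), x(t - tau_1'), ...) from nonuniform samples.

    `times` must be sorted in increasing order. For every reference sample and
    every delay tau_k, the sample whose time is closest to t - tau_k inside the
    closed window [t - tau_k - w_k, t - tau_k + w_k] is accepted. Reference
    samples for which any window is empty are skipped (an assumption).
    Returns a list of (t, vector) pairs.
    """
    vectors = []
    for i, t in enumerate(times):
        vec = [values[i]]
        for tau, w in zip(delays, widths):
            target = t - tau
            lo = bisect_left(times, target - w)    # first sample inside the window
            hi = bisect_right(times, target + w)   # one past the last sample inside
            if lo >= hi:                           # empty window: no usable point
                vec = None
                break
            j = min(range(lo, hi), key=lambda k: abs(times[k] - target))
            vec.append(values[j])
        if vec is not None:
            vectors.append((t, vec))
    return vectors


# Usage: one delay tau = 1.0 with tolerance 0.25 on irregularly timed samples.
times = [0.0, 0.9, 2.1, 3.0, 4.2]
values = [10, 11, 12, 13, 14]
pairs = [v for _, v in fuzzy_delay_vectors(times, values, [1.0], [0.25])]
```

Widening the tolerance admits more vectors at the cost of a less sharply defined delay, which is the trade-off the later sections quantify.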

It  is  important  to  realize  that  the  fuzzy  recon¬ 
structions  we  have  introduced  do  not  directly 
smooth  the  data.  The  information  in  the  original 
observation  times  is  still  available,  and  will  be 
exploited  in  modelling  and  forecasting,  (sections 
6  and  7). 

Dynamical  smoothing  of  the  state  space  trajec¬ 
tories  [6]  is  sometimes  a  possibility,  but  this 
requires  first  reconstructing  a  state  space  from 
our  nonuniformly  sampled  data,  which  brings  us 
back  to  the  need  for  fuzzy  reconstructions.  In 
fact, once the learning algorithm described in
section 5 has been applied, we may be able to
smooth  the  original  data  and  interpolate  to  uni¬ 
formly  spaced  measurements. 

3.  Mutual  information 

To determine which time delays to use in a state space reconstruction, mutual information [7] is frequently employed. Mutual information is a nonlinear measure of how correlated two time series are, and is calculated from the joint probability distributions as a function of delay, $\tau$, as

\[ M(\tau) = \sum P_{x(t),\,x(t+\tau)} \log_2 \frac{P_{x(t),\,x(t+\tau)}}{P_{x(t)}\,P_{x(t+\tau)}}. \tag{3} \]

We are using it to measure the amount of new information introduced by a given choice of delay. The delay corresponding to the first minimum of $M(\tau)$ should give a reconstruction in which the coordinates are maximally independent and avoid the loss of information through chaos that is potentially present in subsequent minima [8]. Topologically, the minima in $M(\tau)$ are those reconstructions for which the dynamics are maximally spread in the state space. This is beneficial to calculating invariants of the dynamics, particularly in the presence of noise.

The calculation of mutual information has always assumed data which was uniformly sampled in time. Since perfectly uniform sampling intervals are almost an experimental impossibility, it is appropriate to ask what tolerance this technique shows to sampling noise. Based upon eq. (2), we have seen that the probability density of time sampling errors, $q(\delta t)$, can be related to an effective probability density of measurement errors, $p(\delta x)$. If the measurements $\{x(t_i)\}$ were made with n-bit resolution, these errors will cause a loss of

\[ m = \frac{\cdots}{\max(x) - \min(x)} \]

bits leaving $n' = n - m$ bits of information in the observations. Since the maximum of $M(\tau)$ equals the bits of information available, this will cause $M(\tau)$ to decrease with increasing $\langle \delta x^2 \rangle$. However, this does not suggest any dependence upon $\tau$. Therefore, it is reasonable to expect that while the mutual information will decrease with $\langle \delta x^2 \rangle$, the location of the minima may be preserved. We consider the case of a Rössler system [9],

\[ \dot{x} = -y - z, \qquad \dot{y} = x + 0.343\,y, \qquad \dot{z} = 1.83 + (x - 9.75)\,z, \]

sampled nonuniformly producing a finite length data set. By making fuzzy reconstructions, we can choose a width $\delta\tau$ about the delay $\tau$ to
maximize the amount of usable data while preserving the location of the minima in $M(\tau)$. If $\delta\tau$ is too small, we are left with so little data that $M(\tau)$ becomes lost in the noise due to finite-size effects. The uncertainty in the mutual information is

\[ \delta M \approx \frac{\beta(\beta - 2)}{2N}, \tag{5} \]

where $\beta$ is the number of bins used in the calculation of the probability distributions and $N$ is the number of data points [10].
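
To make the estimate concrete, the following sketch computes a histogram-based mutual information in bits from a list of value pairs, such as the pairs $(x(t), x(t + \tau'))$ accepted by a fuzzy delay window, together with the finite-size uncertainty of eq. (5). The equal-width binning and the function names are assumptions made for illustration; the estimator used in the paper may differ in detail.

```python
import math


def mutual_information(pairs, bins):
    """Histogram estimate (in bits) of the mutual information between the two
    components of `pairs`, using `bins` equal-width bins on each axis."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)

    def index(v, lo, hi):
        # Map a value to its bin; the upper edge is folded into the last bin.
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    joint = {}
    px = [0] * bins
    py = [0] * bins
    for x, y in pairs:
        i, j = index(x, x_lo, x_hi), index(y, y_lo, y_hi)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] += 1
        py[j] += 1
    n = len(pairs)
    # sum over occupied cells of p_ij * log2(p_ij / (p_i * p_j))
    return sum(c / n * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def mi_uncertainty(bins, n_points):
    """Finite-size uncertainty beta * (beta - 2) / (2 N) from eq. (5)."""
    return bins * (bins - 2) / (2 * n_points)


# Usage: identical coordinates over four equally likely levels, and an
# independent pair of coordinates for comparison.
identical = [(k % 4, k % 4) for k in range(400)]
independent = [(i, j) for i in range(4) for j in range(4)]
mi_identical = mutual_information(identical, bins=4)
mi_independent = mutual_information(independent, bins=4)
```

Because the uncertainty grows roughly as the square of the number of bins and falls as 1/N, widening the window to admit more pairs directly reduces the error bar on each value of the mutual information.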

The mutual information for a randomly sampled data set from the Rössler system is shown in fig. 2 as a function of $\tau$ and $\delta\tau$. Cross-sections of this plot are shown in fig. 3 with the error bars included. From these figures, we find that mutual information is very tolerant to large $\delta\tau$. Although the difference in heights between successive maxima and minima decreases as $\delta\tau$ increases, the locations of those extrema are essentially unchanged. In fact, larger $\delta\tau$ combined with the resulting availability of more data acts to smooth the curve from small $\delta\tau$. This implies that there is an optimal $\delta\tau > 0$ which gives the best information about the location of the minima. The only time the position of the minima is severely affected is when $\delta\tau > \tau$, which means that $\tau$ is actually no longer the most probable delay between points in the reconstruction.

Fig. 2. The mutual information landscape over reconstruction coordinates $\tau$ and $\delta\tau$ is shown for the Rössler system. A fixed data set of 25 000 nonuniformly sampled points was used so that as $\delta\tau$ increases, more data becomes available. The ridge corresponding to $\tau = \delta\tau$ occurs because of the asymmetric nature of the windows for $\tau < \delta\tau$. Those values should be ignored.

Fig. 3. Slices from fig. 2 are shown with the error bars included: (a) $\delta\tau = 0.05$, (b) $\delta\tau = 1.0$, and (c) $\delta\tau = 2.0$. As $\delta\tau$ increases, the mutual information and the statistical errors therein decrease. Note that even for $\delta\tau = 1.0$, the location of the first minimum is preserved. This is a very large window since the average time to orbit the attractor is ~6.0.

We have conducted the same studies with iterated maps, e.g. Hénon [11] and Ikeda maps, with similar results. The only differences were that $\delta\tau$ increases by integer amounts and minima in $M$ are less informative since the natural delay to use with most maps is $\tau = 1$. The behavior with $\delta\tau$ as documented above remains unchanged.


4.  Dimension 

Mutual information gives us a criterion with which to choose reconstruction coordinates. By computing the topological dimension of the system, we can estimate how many of these coordinates are required for a deterministic state space representation. It has been shown that if the fractal dimension of the manifold is m, an embedding of the system can be generated with n coordinates if n > 2m [12].

Many different dimensions can be calculated to characterize a system [13], but we will focus upon the correlation dimension [14] because of its efficiency and broad use. The same generalisations carry over to the other techniques. The limitations of correlation dimension calculations have been examined for measurement noise [15] and sparse data sets [16], but nonuniformly sampled data have not yet been considered.

The correlation dimension for a given reconstruction is determined by examining the correlation integral,

\[ C(r) = \frac{1}{N^2} \sum_{i \neq j} \Theta\bigl(r - |x_i - x_j|\bigr), \tag{6} \]

where $\Theta$ is the Heaviside function and $C(r)$ is the (normalized) number of pairs of points whose distance is less than $r$. The vectors $x_j$ are typically constructed from the time series, $x(t)$, with delay coordinates. A high dimensional reconstruction is used to fully unfold the dynamics. If we have a scaling region, $C(r) \propto r^{\nu}$, over a large range of $r$ then $\nu$ is taken to be the correlation dimension. Here again we can choose a delay, $\tau' \in [\tau - \delta\tau, \tau + \delta\tau]$, with which to construct a fuzzy state space and perform the calculation. In this case, we can refer back to the mutual information to select the $\delta\tau$ which admits the most data without destroying the state space correlations.
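
A brute-force version of this calculation is short enough to sketch. The code below counts the fraction of distinct pairs closer than r and estimates the scaling exponent from two radii; normalising by the number of distinct pairs and using only two radii are simplifying assumptions, and the function names are illustrative.

```python
import math


def correlation_sum(vectors, r):
    """Fraction of distinct pairs of vectors whose Euclidean distance is below r."""
    n = len(vectors)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(vectors[i], vectors[j]) < r:
                close += 1
    return 2.0 * close / (n * (n - 1))


def scaling_slope(vectors, r_small, r_large):
    """Slope of log C(r) against log r between two radii.

    Inside a scaling region this slope approximates the correlation dimension.
    Both radii must be large enough that at least one pair is counted.
    """
    c_small = correlation_sum(vectors, r_small)
    c_large = correlation_sum(vectors, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small))


# Usage: points spread along a line segment should give a slope close to 1.
line = [(k / 200.0, 0.0) for k in range(200)]
slope = scaling_slope(line, 0.05, 0.2)
```

In practice one would fit the slope over many radii inside the scaling region; the quadratic cost in the number of vectors is also why admitting more data through a wider window has a price.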

If we consider a nonuniformly sampled continuous system such as our Rössler attractor example, we can show experimentally how the error in the sampling interval, $\delta t_i$, is related to measurement noise, fig. 4. We extract the correlation dimension by computing the slope of the scaling region. These plots typically have at least three distinct regions. For very small r, every point appears to be isolated, so the slope is 0. Past a certain large r, the whole attractor is enclosed so C(r) is constant and again the slope is 0. When there is no noise in the system, the remainder of the curve defines the fractal scaling region and gives $\nu = 1.9$.

Fig. 4. Correlation dimension calculations are shown for increasing time sampling errors (a), and measurement noise (b). For fig. (a), the curves from left to right correspond to time sampling errors of 0.0, 0.001, 0.01, 0.04, 0.1, 0.4, and 1.0. For fig. (b), noise amplitudes of 0.0, 0.1, 0.4, 1.0, 4.0, and 10.0 are shown. Note that the regions of fractal scaling shrink as the noise amplitudes increase. The calculation was done with a five dimensional embedding space. Below the fractal scaling region, the curves all show a slope consistent with noise in a five dimensional embedding space.

When noise of either type is added, the slope does not change; but the scaling region on which it is defined shrinks as the small scale structure of the attractor is destroyed. The slope of the lost region becomes equivalent to the embedding dimension, since the added noise is uniformly space filling. The difference between the two cases is in the rate at which the fractal scaling region disappears. By plotting the size of the scaling region (in log(r)) versus the noise amplitude, we obtain a simple relation between time sampling errors and measurement errors, fig. 5.

Although nonuniform time sampling and measurement noise have similar effects upon the calculations, we have more flexibility when dealing with timing errors. Dimension calculations always require large amounts of data, so if we have a very limited data set, we should choose $\delta\tau$ as large as possible. This allows us to obtain a value for $\nu$ even though the scaling region is reduced. If, however, we have a large amount of data, we can set $\delta\tau$ very small to increase the scaling region and improve the accuracy of our calculations.

When we consider nonuniformly sampled data from a perfectly timed system, such as an iterated map, we find a very different behavior. Taking randomly sampled data from the Hénon map [11],

\[ x_{n+1} = 1 - a x_n^2 + y_n, \qquad y_{n+1} = b x_n, \]

we find that for a given choice of delay coordinate, $\tau$, the long range fractal structure degenerates as $\delta\tau$ is increased, but the small scale structure is intact, fig. 6. This is easily understood if one considers the state space reconstruction resulting from the coordinates $\tau' \in [1, 3]$, i.e., $\tau = 2$ and $\delta\tau = 1$. What we actually have is a superposition of three separate reconstructions, those with delays of 1, 2, and 3. These three reconstructions have the same topological features, but they cover the state space very differently. This leaves the small scale structure mostly undisturbed, but the longer range fractal scaling breaks down.

Fig. 5. The size of the fractal scaling regions from fig. 4 are graphed versus the noise amplitude (horizontal axis: noise amplitude). The filled squares represent time sampling errors and the open circles are measurement noise. A simple fit to the data shows that a sampling error of $\delta t$ corresponds to an average measurement noise of approximately $0.5\,(\delta t)^{\alpha}$. The uncertainty in the exponent $\alpha$ is large, and under perfect statistics it might be unity.

These  examples  for  continuous  and  iterated 
systems  show  that  fuzzy  reconstructions  are  an 
effective  method  for  maximizing  the  use  of  the 
available  data  without  disturbing  the  calculation 
of  the  dimension.  In  combination  with  mutual 
information,  this  provides  a  possible  procedure 
for  obtaining  representations  of  nonuniformly 
sampled  data.  For  both  these  statistics,  we  show 
that  coarse  graining  the  time  is  similar  to  mea¬ 
surement  noise,  which  is  not  surprising.  The 
nontrivial  aspect  is  determining  what  coarse 
graining  makes  best  use  of  the  available  data. 


Fig. 6. Correlation dimension calculations for nonuniformly sampled data from the Hénon map. A five dimensional embedding was used with $\tau = 2$. The curves represent $\delta\tau = 0$, 1, and 2 from left to right. Note that the slope of the fractal scaling region is preserved, but the size of this region decreases as $\delta\tau$ increases. The breakdown for small distances, r, is due to poor statistics.

5. Optimal representations

The analysis of the two previous sections was an attempt to find a “good” representation of the current state of a dynamical system. This meant finding a representation of the system which was deterministic and well spread in the state space so as to reveal the details of the topology. In this section we examine the concept of an optimal representation and discuss modifications to a recently developed learning algorithm [5] to find optimal representations of nonuniformly sampled data.

To determine what is meant by an optimal representation, one must consider the ultimate goal of the representation. It is unlikely that any single representation will be optimal for all possible objectives. We have discussed the purpose for using mutual information; however, this representation will not necessarily be optimal for other pursuits, such as forecasting or noise reduction [6], where optimization is done with respect to the future state of the system. If one’s goal is forecasting or control [17] of the observed variable only, the quality of a reconstruction of any dimensionality should be based upon the predictability of only the observable, $x(t)$. Many other possibilities can be envisioned. We have proposed a technique for searching through the space of possible coordinates to find the representation that best suits the goals of the experimenter. This employs a learning algorithm based upon the genetic algorithm [18] to find optimal representations, because of the potential complexity of the fitness landscape being searched. Since we are searching through dimensions, the dimensionality of the quality landscape is not even known. A search is advantageous, because situations arise in which the best D-dimensional representation is not a subspace of the best (D + 1)-dimensional representation [19]. In such cases, building a representation by sequentially adding new coordinates would be ineffective. This algorithm is designed to search through the space of possible representations to locate the one best suited to the stated goal. In the present context, we restrict this search to fuzzy delay coordinates.

For the search, each state space reconstruction is encoded in a “genome” where the “genes” describe how each coordinate is generated. For example, genome $(\tau_1, \delta\tau_1; \tau_2, \delta\tau_2)$ specifies a three dimensional representation $(x(t), x(t + \tau_1'), x(t + \tau_2'))$ where $\tau_1' \in [\tau_1 - \delta\tau_1, \tau_1 + \delta\tau_1]$ and $\tau_2' \in [\tau_2 - \delta\tau_2, \tau_2 + \delta\tau_2]$. During the search, $\tau_i$, $\delta\tau_i$, and the number of these are all optimized (fig. 7).
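
The genome encoding and the rank-and-mutate loop can be sketched as follows. The text does not specify the genetic operators, so the three mutations used here (add a gene, drop a gene, perturb a gene), the population sizes, and all function names are hypothetical choices made only to illustrate the idea; `quality` stands for whatever quality function the experimenter supplies.

```python
import random


def random_gene(max_delay, rng):
    """One gene: a delay tau and the half-width dtau of its tolerance window."""
    return (rng.uniform(0.0, max_delay), rng.uniform(0.0, max_delay / 4.0))


def random_genome(max_genes, max_delay, rng):
    """A genome is a list of genes, one per delayed coordinate."""
    return [random_gene(max_delay, rng) for _ in range(rng.randint(1, max_genes))]


def mutate(genome, max_genes, max_delay, rng):
    """Return a mutated copy: add a gene, drop a gene, or perturb one gene."""
    child = list(genome)
    op = rng.choice(("add", "drop", "perturb"))
    if op == "add" and len(child) < max_genes:
        child.append(random_gene(max_delay, rng))
    elif op == "drop" and len(child) > 1:
        child.pop(rng.randrange(len(child)))
    else:
        k = rng.randrange(len(child))
        tau, dtau = child[k]
        tau = min(max(tau + rng.gauss(0.0, 0.1 * max_delay), 0.0), max_delay)
        dtau = min(max(dtau + rng.gauss(0.0, 0.05 * max_delay), 0.0), max_delay)
        child[k] = (tau, dtau)
    return child


def search(quality, generations, pop_size, max_genes, max_delay, seed=0):
    """Rank the population by quality, keep the best quarter, refill by mutation."""
    rng = random.Random(seed)
    population = [random_genome(max_genes, max_delay, rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=quality, reverse=True)
        parents = ranked[: max(1, pop_size // 4)]
        children = [mutate(rng.choice(parents), max_genes, max_delay, rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=quality)


# Usage with a toy quality function that prefers two genes and a first delay near 0.9.
def toy_quality(genome):
    return -abs(len(genome) - 2) - abs(genome[0][0] - 0.9)


best = search(toy_quality, generations=30, pop_size=20, max_genes=4, max_delay=5.0, seed=1)
```

Because the number of genes is itself mutated, the search moves between dimensions rather than growing a representation one coordinate at a time.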

As mentioned, many goals and corresponding quality functions are possible. As an example, we will use local linear predictability as our criterion. This is done by modelling a D-dimensional representation locally as

\[ \frac{x_j(t') - x_j(t)}{t' - t} = m_j \cdot x + b_j, \quad \forall j \in [1, D], \tag{7} \]

where $m_j$ and $b_j$ are free coefficients [20], $x$ is the fuzzy delay vector, and $x(t) \to x(t')$ under the dynamics. To apply this “locally”, we partition the state space using a k-D tree data structure [21]. We define the quality of a given reconstruction as $Q = H^{-1}$ where

\[ H = \sum_{k=1}^{N_l} \sum_{i=1}^{M_k} \sum_{j=1}^{D} \Bigl\{ x_j(t'^{\,i}) - \bigl[ (m_j \cdot x^i + b_j)(t'^{\,i} - t^i) + x_j(t^i) \bigr] \Bigr\}^2, \tag{8} \]

and $N_l$ is the number of leaves and $M_k$ is the number of data points in the kth leaf.

Fig. 7. A schematic of the learning algorithm. The search begins by constructing an initial random population of genes. These representations are evaluated according to the quality function and ranked by the quality, Q. The highest quality members of the population are mutated using the genetic operators. These new members are then evaluated, ranked, and so forth. Eventually the population converges to an optimal representation.
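
The quality function can be illustrated with a deliberately simplified scalar version. In the sketch below the state is one-dimensional, the leaves are equal-width bins standing in for the k-D tree, and the rate model is fitted by ordinary least squares within each leaf; all three choices, and the function name, are assumptions made to keep the example short and are not the authors' implementation.

```python
def local_linear_quality(samples, n_leaves):
    """Local linear predictability for a scalar state.

    `samples` is a list of (t, x, t_next, x_next). The state axis is split into
    `n_leaves` equal-width bins. In each bin the rate (x_next - x) / (t_next - t)
    is fitted as m * x + b, and the squared prediction errors of
    x + (m * x + b) * (t_next - t) are accumulated into H.
    Returns (H, Q) with Q = 1 / H (infinite when H is exactly zero).
    """
    xs = [s[1] for s in samples]
    lo, hi = min(xs), max(xs)
    leaves = [[] for _ in range(n_leaves)]
    for s in samples:
        k = 0 if hi == lo else min(int((s[1] - lo) / (hi - lo) * n_leaves), n_leaves - 1)
        leaves[k].append(s)

    H = 0.0
    for leaf in leaves:
        if not leaf:
            continue
        pts = [x for _, x, _, _ in leaf]
        rates = [(xn - x) / (tn - t) for t, x, tn, xn in leaf]
        n = len(leaf)
        mean_x = sum(pts) / n
        mean_r = sum(rates) / n
        sxx = sum((x - mean_x) ** 2 for x in pts)
        # Least-squares slope; a leaf with no spread in x falls back to a constant rate.
        m = (sum((x - mean_x) * (r - mean_r) for x, r in zip(pts, rates)) / sxx
             if sxx > 0.0 else 0.0)
        b = mean_r - m * mean_x
        for t, x, tn, xn in leaf:
            predicted = x + (m * x + b) * (tn - t)
            H += (xn - predicted) ** 2
    Q = float("inf") if H == 0.0 else 1.0 / H
    return H, Q


# Usage: data whose rate is exactly linear in x, observed over unequal intervals.
samples = []
for k in range(50):
    x = 0.1 * k
    dt = 0.05 + 0.01 * (k % 5)
    samples.append((0.0, x, dt, x + (-0.5 * x + 0.1) * dt))
H, Q = local_linear_quality(samples, n_leaves=4)
```

Note that the interval between the two observations enters every prediction explicitly, so unequal sampling intervals need no special treatment.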

We test this numerically upon the same Rössler system data discussed earlier. In fig. 8, we show the quality as a function of $\tau$ and $\delta\tau$. We restricted the analysis to 1,000 data points to illustrate the same interplay between maximizing the data used and minimizing the time sampling errors that was seen for mutual information. Here, we find a preferred reconstruction of $\tau \approx 0.9$ and $\delta\tau \approx 0.2$. As before, the values for $\tau < \delta\tau$ are unreliable. Although we show a quality landscape in $\tau$ and $\delta\tau$ for two dimensional reconstructions, under normal operation the program would search through this space and in higher dimensions to find the optimal representation.

Note that for this modelling, the observation time is exploited whereas the previous statistics ignored it.

Fig. 8. The quality for prediction using two dimensional reconstructions with delay coordinate $\tau$ and width $\delta\tau$ is shown as a function of these variables using 1000 nonuniformly sampled data points from the Rössler system in fig. 2. In order to collect enough data points for the modelling, a value $\delta\tau > 0$ was preferred. The best parameters were $\tau \approx 0.9$ and $\delta\tau \approx 0.2$.


6.  Modeling  and  forecasting 

To  this  point,  we  have  focussed  our  attention 
upon  the  fundamental  problem  of  how  best  to 
represent  nonuniformly  sampled  time  series 
data.  Now,  we  would  like  to  discuss  briefly  some 
of  the  options  available  for  modelling  and  fore¬ 
casting.  Even  though  some  methods  do  both,  we 
distinguish  modelling  and  forecasting,  because 
we  can  construct  models  which  have  explanatory 
value,  or  do  forecasting  without  improving  our 
understanding  of  the  system’s  dynamics. 

When  we  consider  nonuniformly  sampled  data 
that  we  may  be  unable  to  interpolate,  our  op¬ 
tions  are  severely  limited.  If  a  single,  globally 
defined  model  of  the  dynamics  is  desired,  equa¬ 
tions  of  motion  in  the  form  of  a  coupled  set  of 
ordinary  differential  equations  (ODEs)  may  be 
fit parametrically to the data [22]. In this technique measurements at any time are acceptable since the equations are numerically integrated from $t_i$ to $t_{i+1}$ to compare the predicted and experimental values. This method can be applied
when  all  the  variables  in  the  model  equations  are 
measured  experimentally,  or  the  unobserved 
variables  may  be  derived  from  the  observed 
variables  via  the  model  [4].  Obviously,  this  tech¬ 
nique  works  best  when  one  already  has  a 
theoretical  model  for  the  system  with  a  few  free 
parameters.  In  fact,  this  is  one  of  the  few  model¬ 
ling  procedures  in  which  prior  physical  knowl¬ 
edge  about  the  system  can  be  easily  incorporated 
directly  into  the  model. 
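
A minimal sketch of this kind of parametric fit is given below. The one-parameter model dx/dt = -a x, the fixed-step fourth-order Runge-Kutta integrator, and the grid search over candidate values are all hypothetical stand-ins chosen for brevity; the point is only that each observation is predicted by integrating from the previous observation time, so the spacing of the data is irrelevant.

```python
import math


def rk4_step(f, x, t, dt):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(x, t)."""
    k1 = f(x, t)
    k2 = f(x + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = f(x + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = f(x + dt * k3, t + dt)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate(f, x, t0, t1, max_step=0.01):
    """Integrate from t0 to t1 in equal substeps no longer than max_step."""
    n = max(1, math.ceil((t1 - t0) / max_step))
    dt = (t1 - t0) / n
    t = t0
    for _ in range(n):
        x = rk4_step(f, x, t, dt)
        t += dt
    return x


def misfit(a, times, values):
    """Sum of squared one-step prediction errors for the model dx/dt = -a x."""
    def f(x, t):
        return -a * x
    return sum((integrate(f, values[i], times[i], times[i + 1]) - values[i + 1]) ** 2
               for i in range(len(times) - 1))


def fit_decay_rate(times, values, candidates):
    """Pick the candidate parameter with the smallest misfit."""
    return min(candidates, key=lambda a: misfit(a, times, values))


# Usage: irregularly timed observations of exp(-0.7 t).
times = [0.0, 0.3, 0.35, 1.2, 1.9, 2.0, 3.4]
values = [math.exp(-0.7 * t) for t in times]
best_a = fit_decay_rate(times, values, [k / 100 for k in range(1, 201)])
```

A practical fit would replace the grid search with a proper optimizer and a multi-variable model, but the structure of the misfit is the same.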

While this method utilizes ODEs to generate a global model, if forecasting is our only objective, it may be more effective to construct several sets of locally defined ODEs. In essence, this is what was done in the Rössler system example of the previous section where local linear difference equations were employed (eq. (7)). Whereas before we were using this to find a state space representation of the system to provide physical information, we could also use this to predict the future state of the system. In real time applications, this procedure is typically far less computationally intensive than the global modelling procedure.


7.  Example:  Quasar  emissions 

To  illustrate  the  application  of  these  concepts, 
we  consider  observations  of  the  optical  emissions 
of quasar B2 1308 + 326. The data set consists of
221  nonuniformly  sampled  measurements  of  the 
B  magnitude  flux  (fig.  9).  A  quick  inspection  of 
the  data  shows  that  it  is  unreasonable  to  consider 
any  interpolation,  since  we  appear  to  have 
dynamics  on  time  scales  short  compared  to  the 
observation  intervals.  This  indicates  a  need  for 
fuzzy  delay  coordinate  analysis. 


Fig. 9. Measurements of the B flux of quasar B2 1308 + 326 with error bars. The time is given in Julian days and the flux is in mJy.


Also,  the  data  is  very  sparse  -  so  sparse  that 
mutual  information  or  dimension  calculations 
would  not  be  statistically  significant.  Therefore, 
we  employ  the  learning  algorithm  discussed  in 
section 5 to learn a predictive state space reconstruction of the data. Because of the poor quality of the data, we cannot find any of the usual invariants: dimension, Lyapunov exponents, etc.
Instead,  we  just  wish  to  determine  if  the  data 
comes  from  a  dynamical  system  rather  than  a 
noise  process. 

To test for dynamics, we first compute the quality of the best learned representation. This was simply $Q = 1/\eta$ where $\eta$ was the estimation error normalized by the error bars given with the data. To establish a null-hypothesis, we take the original data and shuffle it in time, i.e., we take the original observation times and measurements and randomly pair them to generate a randomized data set. This shuffled data is given to the learning algorithm to maximize the quality. We repeated this process many times and histogrammed the qualities (fig. 10).

We find that the histogram of models for shuffled data has a mean quality of 0.011 and standard deviation of $\sigma = 0.0036$. Since the model of the original data has $Q = 0.26$, it is $69\sigma$ away from the null-hypothesis. This means that with high probability, the time ordering of this data is important for an accurate modelling. This does not exclude a time-correlated noise process, but it is the first evidence that the data is not purely random. The astrophysical implications of the quasar analysis are discussed in greater depth by Breeden, Mufson, and Packard [23].

Fig. 10. A histogram of the qualities from the shuffled data tests of the quasar data (horizontal axis: quality). The single point at large quality is from the real data. The separation between this point and the shuffled data null-hypothesis indicates a time dependence in the quasar emissions.
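
The shuffle test itself is easy to sketch. In the code below the observation times are kept fixed, the measurements are randomly re-paired with them, and the quality of the real ordering is compared with the distribution of shuffled qualities. The `smoothness_quality` function is a toy stand-in for the learned-model quality used in the paper, and the function names and synthetic data are illustrative assumptions.

```python
import random
import statistics


def smoothness_quality(times, values):
    """Toy quality: inverse mean squared rate of change between successive samples."""
    rates = [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
             for i in range(len(times) - 1)]
    return 1.0 / statistics.mean(r * r for r in rates)


def shuffle_test(times, values, quality, n_shuffles, seed=0):
    """Compare the quality of the real ordering with randomly re-paired data.

    Returns (q_real, null_mean, null_sigma, separation), where the separation
    is (q_real - null_mean) / null_sigma.
    """
    rng = random.Random(seed)
    q_real = quality(times, values)
    q_null = []
    for _ in range(n_shuffles):
        shuffled = list(values)
        rng.shuffle(shuffled)          # times stay fixed, measurements are re-paired
        q_null.append(quality(times, shuffled))
    null_mean = statistics.mean(q_null)
    null_sigma = statistics.stdev(q_null)
    return q_real, null_mean, null_sigma, (q_real - null_mean) / null_sigma


# Usage on a smooth synthetic signal observed at irregular times.
times = [k + 0.3 * ((k * 7) % 10) / 10.0 for k in range(200)]
values = [k * k / 40000.0 for k in range(200)]
q_real, null_mean, null_sigma, separation = shuffle_test(
    times, values, smoothness_quality, n_shuffles=100)
```

With the numbers quoted in the text, the same separation is (0.26 - 0.011) / 0.0036, which is approximately 69.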

8.  Conclusions 

Nonlinear  analysis  is  still  in  its  adolescence.  As 
a  logical  step  in  its  growth,  we  felt  it  necessary  to 
demonstrate  that  two  of  the  more  commonly 
used  tools,  mutual  information  and  correlation 
dimension,  are  applicable  to  situations  involving 
nonuniform  time  sampling.  This,  in  itself,  is  an 
important  question  since  almost  all  data  from 
real  experiments  exhibit  this  to  some  degree.  We 
accomplished  this  by  first  generalizing  the  con¬ 
cept  of  delay  coordinate  reconstructions  to  fuzzy 
reconstructions.  These  results  are  not  surprising, 
but we have also found that an intelligent choice of $\delta\tau$ can make much more effective use of the data than some arbitrary binning. Specifically, in sparse data situations, choosing a large window, $\delta\tau$, about the delay coordinate, $\tau$, can improve the accuracy of the calculations by increasing the amount of usable data without destroying the results.

Through  analytic  calculations  and  numerical 
studies  of  dimension  calculations,  we  have  dem¬ 
onstrated  the  relationship  between  measurement 
noise  and  time  sampling  errors.  By  comparing 
the  size  of  the  fractal  scaling  regions  we  can 
quantify  the  relationship  between  these  effects 
for  a  given  example. 

Fuzzy reconstructions are also a natural extension to the learning algorithm developed to search for optimal representations. In this case, the learning algorithm chooses the best window in delay, $\delta\tau$, as part of the delay coordinates to facilitate the stated goal. This can be a beneficial preprocessor to calculating dynamical invariants (like dimension) to get the fullest use of the data. In a recent application of this technique, optical emissions from a quasar were shown to have some predictability when compared to randomly shuffled versions of the same data [23]. This is despite the fact that the data was nonuniformly sampled, noisy, and sparse: too sparse to apply mutual information or dimension calculations. It is important to notice that while this modelling procedure operated within the fuzzy state space reconstruction, the actual times of the observations were not ignored as in the mutual information and correlation dimension examples. The times are added information which improves the accuracy of the model.

Finally,  we  commented  that  nonuniform  sam¬ 
pling  need  not  be  an  impediment  to  modelling 
and  forecasting.  Techniques  have  been  de¬ 
veloped  for  constructing  global  models  (ODEs) 
which  are  not  at  all  impaired  by  this  sort  of  data. 
Likewise,  the  learning  algorithm  coupled  with 
k-D  trees  and  local  linear  difference  equations 
(or  ODEs)  is  an  effective  forecasting  method 
which  can  handle  nonuniform  sampling  well. 

A method is developed to obtain all possible pure signals which could have been the origins of a given noisy signal. The
method also provides an explicit way to calculate power spectra of these signals.


1.  Introduction 

Autoregressive moving average (ARMA) models have been extensively used for predicting the time evolution of signals [1]. They constitute both the first approximation to use when the signal is not known to be chaotic, and a general method of obtaining the pattern of the “non deterministic” part of a signal by Wold’s decomposition [2]. Recently the ARMA method has been applied, in this sense, to the analysis of chaotic data [3,4]. Such analysis should however be carried out with care. For chaotic and other essentially non-linear signals the non-deterministic part is only guaranteed to be “uncorrelated”, while the ARMA approach implies “independence” [6]. Thus, e.g., an ARMA analysis of the chaotic logistic map $x \to bx(1 - x)$ for $b < 4$ yields some “patterned” part [4], while for $b = 4$ a completely “uncorrelated” generator (which can be interpreted as white noise) ensues. A further decomposition of the non-deterministic part can be based on general non-linear methods [5,7] or on specific more direct methods [8] when chaotic attractors are known or assumed to exist.
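
The distinction between “uncorrelated” and “independent” can be checked numerically. The sketch below iterates the logistic map and estimates the sample autocorrelation at a chosen lag; the initial condition, series length, and function names are arbitrary illustrative choices. At b = 4 the lag-one autocorrelation should be close to zero even though each value is a deterministic function of the previous one.

```python
def logistic_series(b, x0, n, discard=100):
    """Iterate x -> b x (1 - x), dropping `discard` transient values first."""
    x = x0
    out = []
    for k in range(n + discard):
        x = b * x * (1.0 - x)
        if k >= discard:
            out.append(x)
    return out


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag)) / (n - lag)
    return cov / var


# Usage: lag-one autocorrelation of the map at b = 4.
series = logistic_series(4.0, 0.3, 5000)
r1 = autocorrelation(series, 1)
```

Running the same function for a value of b below 4 lets one compare against the “patterned” case mentioned in the text.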

When a Box-Jenkins [1] or similar analysis is carried out, a specific
ARMA filter is obtained as being responsible for “creating” the signal
from a random shock generator. Usually there appears an added white
noise as a result of measuring instruments or as an intrinsic property
of the process measured itself. The amount of noise included in the
signal is generally unknown. This noise brings about alterations in the
ARMA parameters obtained [9] and hence also in the predicted time
behaviour. Since the exact magnitude of such noise is not usually
available, it is evident that a complete unravelling of the “pure”
signal from the noisy one is impossible. It would seem, however, of
advantage to have even a partial possibility of “purifying” the signal,
such as a knowledge of all possible pure signals which could have
played the role of our signal’s origin under the addition of different
amounts of white noise.
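How added white noise alters what is estimated can be illustrated with the simplest case. The sketch below is an illustration under stated assumptions (an AR(1) pure signal with hypothetical parameter `p = 0.8` and unit shock variance), using the standard AR(1) autocovariance $\gamma(k) = \sigma_a^2 p^{|k|}/(1 - p^2)$. White noise adds its variance at lag 0 only, so the lag-1 autocorrelation is depressed while the ratio $\gamma(2)/\gamma(1)$, which carries the AR parameter, is untouched.

```python
def ar1_autocov(p, var_a, lag):
    """Autocovariance of a stationary AR(1) process X_t = p X_{t-1} + a_t."""
    return var_a * p ** abs(lag) / (1.0 - p * p)


def noisy_autocov(p, var_a, var_b, lag):
    """Autocovariance of Z_t = X_t + b_t, with b_t white noise of variance var_b."""
    g = ar1_autocov(p, var_a, lag)
    return g + var_b if lag == 0 else g


def ratios(p, var_a, var_b):
    """Return (gamma(1)/gamma(0), gamma(2)/gamma(1)) for the noisy signal."""
    g0, g1, g2 = (noisy_autocov(p, var_a, var_b, k) for k in range(3))
    return g1 / g0, g2 / g1


p, var_a = 0.8, 1.0  # hypothetical values chosen for illustration
for var_b in (0.0, 0.5, 2.0):
    rho1, ar_ratio = ratios(p, var_a, var_b)
    print(f"var_b={var_b}: gamma1/gamma0={rho1:.4f}, gamma2/gamma1={ar_ratio:.4f}")
```

A fit that ignores the noise and reads the AR parameter from the lag-1 autocorrelation would therefore be biased, while the AR part itself is unchanged, which is the property used in the method that follows.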

For  the  ARMA  (2, 2)  case  a  complete  analysis 
was  carried  out  previously  [10].  By  a  non-linear 
transformation  the  loci  of  the  parameters,  for 
all  possible  pure  signals,  were  shown  to  lie  on 
a  straight  line.  The  Fourier  transform  and 
predicted  time  evolution  could  be  calculated. 
Here we will discuss the general ARMA (m, l)
case.
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2.  Method 

We consider a stationary and invertible signal of an ARMA (m, l) type,
$l \le m$, given by

$$ Z_t = \frac{1 - q_1 B - q_2 B^2 - \cdots - q_l B^l}{1 - p_1 B - p_2 B^2 - \cdots - p_m B^m}\, I_t , \tag{1} $$

where t is the discrete time, $Z_t$ is the “centered” signal (i.e. the
signal minus its mean, $Z_t = \tilde{Z}_t - \langle \tilde{Z}_t \rangle$),
$p_1, \ldots, p_m$ and $q_1, \ldots, q_l$ are the AR and MA parameters,
respectively, $I_t$ is the random shock generator with zero mean and
variance $\sigma_I^2$, and B is the shift operator. We assume that this
signal is composed of an original “pure” signal $X_t$ with an added
white noise $b_t$ with variance $\sigma_b^2$:

$$ Z_t = X_t + b_t . \tag{2} $$

The pure signal $X_t$ has to be of the ARMA (m, m) type [11].

Since the addition of white noise does not change the AR parameters
[1], the form of $X_t$ is

$$ X_t = \frac{1 - q_{1,0} B - q_{2,0} B^2 - \cdots - q_{m,0} B^m}{1 - p_1 B - p_2 B^2 - \cdots - p_m B^m}\, a_t . \tag{3} $$

Here $q_{i,0}$ are the pure signal MA parameters and $a_t$ is the
generator of the pure signal, having a variance $\sigma_a^2$. The
values $q_i$, $p_i$, $\sigma_I^2$ are obtained directly from the
signal, on using any ARMA program, and we wish to estimate $q_{i,0}$,
$\sigma_a^2$ and $\sigma_b^2$, which are unknown.

Let us denote

$$ Y_t = (1 - p_1 B - p_2 B^2 - \cdots - p_m B^m) Z_t , \tag{4} $$

then, calculating $E(Y_t Y_{t+j})$ and using eqs. (1)-(3) and the white
noise character of $a_t$, $b_t$ and $I_t$, we obtain

$$ \sigma_a^2 \sum_{i=0}^{m-j} q_{i,0}\, q_{i+j,0} + \sigma_b^2 \sum_{i=0}^{m-j} p_i\, p_{i+j} = \sigma_I^2 \sum_{i=0}^{m-j} q_i\, q_{i+j} , \tag{5} $$

for j = 0, 1, ..., m. Here $q_0 = p_0 = q_{0,0}$ are defined as -1
(and $q_i = 0$ for $i > l$).

These m + 1 equations for the m + 2 unknowns ($q_{i,0}$,
i = 1, 2, ..., m, $\sigma_a^2$ and $\sigma_b^2$) define the locus of
all possible original processes. These equations are non-linear and
rather inconvenient to solve directly*1. Since they depend upon the
autocorrelations they have appeared repeatedly in time series analysis
[12]. We present here a transformation which markedly simplifies and
facilitates the analysis. Define:

$$ u^j = \frac{\sum_{i=0}^{m-j} q_{i,0}\, q_{i+j,0}}{q_{m,0}} , \quad
   u_1^j = \frac{\sum_{i=0}^{m-j} q_i\, q_{i+j}}{q_m} , \quad
   u_2^j = \frac{\sum_{i=0}^{m-j} p_i\, p_{i+j}}{p_m} , \quad j = 0, 1, \ldots, m-1 , $$

and

$$ \delta = \frac{\sigma_b^2\, p_m}{\sigma_I^2\, q_m} . \tag{6} $$

By inserting eq. (6) into eq. (5) one obtains the locus as a straight
line in m dimensions:

$$ u^j = u_1^j + b\,(u_2^j - u_1^j) , \quad b = \delta/(\delta - 1) . \tag{7} $$

We thus see both the reduction of dimension (m vs. m + 1 in eq. (5))
and the linear character of the locus in u-space. The marked advantage
of the transformation is apparent if the evaluation of the power
spectrum*2 is one’s main interest. This point will now be discussed.


*1 Using MATHEMATICA, we had difficulties in solving the set of
equations (5). For example, in the case m = 4 discussed in fig. 1, no
solution could be obtained on a 386 PC, because of the lack of
sufficient memory after running for over 15 minutes.

*2 Again, since the spectrum is based on the linear analysis, it can be
considered as (an important) first approximation to the non-linear
approach. In the latter, “polyspectra” are used for handling higher
cumulants. See e.g. ref. [7].
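The straight-line locus can be checked numerically without solving the non-linear equations. The sketch below rests on one reading of the scanned equations, which should be treated as an assumption: $u^j$, $u_1^j$, $u_2^j$ are the lagged sums of the pure MA, observed MA and AR coefficients divided by their last coefficient, $\delta = \sigma_b^2 p_m / (\sigma_I^2 q_m)$, and $u^j = u_1^j + b\,(u_2^j - u_1^j)$ with $b = \delta/(\delta - 1)$. All parameter values and function names are hypothetical.

```python
def lagged_sums(params):
    """R_j = sum_{i=0}^{m-j} c_i c_{i+j} for j = 0..m, with c_0 = -1 and c_i = params[i-1]."""
    c = [-1.0] + list(params)
    m = len(params)
    return [sum(c[i] * c[i + j] for i in range(m - j + 1)) for j in range(m + 1)]


def u_coeffs(params):
    """u^j = R_j / c_m for j = 0..m-1 (requires a non-zero last coefficient)."""
    r = lagged_sums(params)
    return [r[j] / params[-1] for j in range(len(params))]


def locus_point(u1, u2, b):
    """Point on the straight line u = u1 + b (u2 - u1)."""
    return [a + b * (c - a) for a, c in zip(u1, u2)]


# Hypothetical pure ARMA(2,2) signal plus white noise.
q_pure = [0.3, -0.4]   # pure MA parameters q_{i,0}
p = [0.5, -0.6]        # AR parameters (unchanged by the noise)
var_a, var_b = 1.0, 0.5

# Noisy-signal lagged sums: sigma_I^2 * sum_i q_i q_{i+j} for j = 0..m.
noisy = [var_a * rq + var_b * rp
         for rq, rp in zip(lagged_sums(q_pure), lagged_sums(p))]

# Because q_0 q_m = -q_m, dividing by the lag-m term gives u_1^j directly.
u1 = [-noisy[j] / noisy[-1] for j in range(len(p))]
u2 = u_coeffs(p)

delta = var_b * p[-1] / (-noisy[-1])   # sigma_b^2 p_m / (sigma_I^2 q_m)
b = delta / (delta - 1.0)

recovered = locus_point(u1, u2, b)
print("u from the line:", recovered)
print("u of pure model:", u_coeffs(q_pure))
```

With b = 0 the line returns the observed signal itself; increasingly negative b corresponds to removing more of the assumed white noise.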
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Fig.  1.  Comparison  of  spectral  densities,  b  =  -0.005,  -50000. 


3.  The  power  spectrum 


No direct subtraction of white noise from the power spectrum is
possible. However, the power spectrum of the pure signal can be
obtained directly from the points in the u-space. Thus, this power
spectrum is given by [1]:

$$ g(\omega) = \text{constant} \times \frac{\left| 1 - q_{1,0} B - \cdots - q_{m,0} B^m \right|^2}{\left| 1 - p_1 B - \cdots - p_m B^m \right|^2} , \quad B = \exp(-i\omega) . \tag{8} $$

Using eq. (6) we can transform eq. (8) to

$$ g(\omega) = \text{constant} \times \left[ u^0 + 2u^1 \cos(\omega) + \cdots + 2u^{m-1} \cos((m-1)\omega) - 2\cos(m\omega) \right] \times \left[ u_2^0 + 2u_2^1 \cos(\omega) + \cdots + 2u_2^{m-1} \cos((m-1)\omega) - 2\cos(m\omega) \right]^{-1} , \tag{9} $$

where the constant is determined by the normalization:

$$ \int_0^{0.5} g(f)\, df = 1 , \quad f = \omega/2\pi . \tag{10} $$


In  fig.  1  we  show  the  results  for  a  demonstration 
case.  We  chose  ARMA  (4, 4)  with  the  following 
parameters:  q\  =  q-i  =  0,  ^2  -1.77,  q^  =  -0.81, 

P\-  Pi  ~  Pi  -  “1-57,  P4  =  -0.64.  We  choose 
b = -0.005 to represent a noisy signal and b = -50000
to represent a fairly cleaned-up signal.
We see quite clearly the “purifying” effect.

It should be noted that the loss of invertibility or of stationarity
can be detected from the power spectrum directly; thus there is no need
to transform back to the $q_{i,0}$ for this purpose. This is achieved
by detecting a zero (node) at a point in the spectrum, which indicates
an intersection of a zero of the $q_{i,0}$ polynomial with the unit
circle. Similarly, an appearance of negative parts in the “power
spectrum” indicates a region of u-space which is not allowed (complex
$q_{i,0}$).
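The equivalence between the polynomial form of the spectrum and the cosine form in the u-coordinates can be verified numerically. This is a sketch under assumptions: the coefficients below are hypothetical (they are not the ARMA (4, 4) values of fig. 1), the u-coordinates are taken as lagged sums divided by the last coefficient, and the normalisation integral is approximated by the trapezoidal rule.

```python
import cmath
import math


def u_coeffs(params):
    """u^j = sum_{i=0}^{m-j} c_i c_{i+j} / c_m, j = 0..m-1, with c_0 = -1."""
    c = [-1.0] + list(params)
    m = len(params)
    return [sum(c[i] * c[i + j] for i in range(m - j + 1)) / c[m] for j in range(m)]


def bracket(u, omega):
    """u^0 + 2 u^1 cos(w) + ... + 2 u^{m-1} cos((m-1) w) - 2 cos(m w)."""
    m = len(u)
    return (u[0] + 2.0 * sum(u[j] * math.cos(j * omega) for j in range(1, m))
            - 2.0 * math.cos(m * omega))


def spectrum_direct(q, p, omega):
    """|1 - sum q_i B^i|^2 / |1 - sum p_i B^i|^2 with B = exp(-i w)."""
    B = cmath.exp(-1j * omega)
    num = 1.0 - sum(qi * B ** (i + 1) for i, qi in enumerate(q))
    den = 1.0 - sum(pi * B ** (i + 1) for i, pi in enumerate(p))
    return abs(num) ** 2 / abs(den) ** 2


def normalised_spectrum(u, u2, n=501):
    """Evaluate the cosine form on f in [0, 0.5] and scale it to unit area."""
    freqs = [0.5 * k / (n - 1) for k in range(n)]
    raw = [bracket(u, 2.0 * math.pi * f) / bracket(u2, 2.0 * math.pi * f) for f in freqs]
    df = freqs[1] - freqs[0]
    area = df * (sum(raw) - 0.5 * (raw[0] + raw[-1]))  # trapezoidal rule
    return freqs, [r / area for r in raw]


q = [0.3, -0.4]   # hypothetical MA parameters
p = [0.5, -0.6]   # hypothetical AR parameters
u, u2 = u_coeffs(q), u_coeffs(p)
freqs, g = normalised_spectrum(u, u2)
print("g(0) =", g[0], " g(0.5) =", g[-1])
```

The two forms differ only by the constant factor $q_m/p_m$, which the normalisation absorbs. A sign change of the numerator bracket over the frequency range would flag a point of u-space that does not correspond to real MA parameters.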
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We  promote  the  idea  of  using  the  statistical  technique  of  cluster  analysis  in  nonlinear  data  analysis.  The  technique  is 
illustrated  by  using  it  to  help  identify  local  oscillators  in  the  wake  of  the  flow  past  a  cylinder. 


1.  Introduction 

Principal component analysis and cluster analysis are two of the most
important procedures of multivariate data analysis. Principal component
analysis has already been introduced into nonlinear dynamics [1], but
to our knowledge cluster analysis has not. Cluster analysis is a
statistical technique used in many fields to help identify natural
groupings in a set of data. The idea of cluster analysis is not new in
nonlinear dynamics. Indeed, it is central to the algorithms that
estimate the spectrum of singularities of the invariant measures of
fractal objects and that locate unstable periodic orbits. However, many
nonlinear dynamicists are unfamiliar with cluster analysis as a
statistical technique. The purpose of this paper is to bring this
technique to their attention and illustrate its use by applying it to
locate the positions of local oscillators in the flow past a cylinder.

The plan of the paper is as follows. In section 2 we describe the data
matrix and the operations of principal component analysis and cluster
analysis on this matrix. In section 3 the experimental system is
described and our results are given in section 4. In section 5 we
discuss our results and our conclusions are given in section 6.

2.  Multivariate  data  analysis 

There  are  a  large  number  of  texts  that  explain 
in  some  detail  the  techniques  that  can  be  used  to 
analyse  multivariate  data.  Many  such  texts  are 
referenced  in  the  comprehensive  and  practical 
guide,  written  primarily  for  astronomers,  by 
Murtagh  and  Heck  [2]. 

Central to the procedures employed in multivariate analysis is the
specification of a data matrix. Principal component analysis and
cluster analysis operate on this matrix.

2.1.  The  data  matrix 

The  data  matrix,  X,  consists  of  n  rows  (called 
the  objects)  and  m  columns  (called  the 
variables).  How  one  specifies  the  data  matrix  is 
of  course  problem  dependent.  For  example,  in 
attempts  to  classify  bacterial  strains  according  to 
the  amount  of  certain  fatty  acids  they  contain 
[3],  the  objects  are  the  strains  and  the  variables 
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are  the  different  fatty  acids.  Each  strain  is  then 
represented as a point in an m-dimensional
space, and principal component analysis and/or
cluster analysis is used to help discover natural
groupings.

The objective of this paper is to suggest that this taxonomic approach
can be used to help study the structure and dynamics in a spatially
extended system. In particular, the state of the flow at different
locations in the wake of a cylinder is recorded in the form of a time
series. Cluster analysis is then employed to identify those parts of
the flow that are similar (or dissimilar). For simplicity we use the
power spectrum to characterise the state rather than the time series
itself. Thus the variables are the frequencies and the objects are the
positions at which the measurements are taken. For a particular
experiment this procedure may be carried out simply by comparing the
different power spectra visually. However, it is easy to see that in a
real-world application it is desirable to have a classification
procedure executed by computer, since this opens up the possibility of
incorporating the procedures and cluster information in an automated
control system.

2.2. Principal component analysis (PCA)

In principal component analysis one looks for a few (i.e., fewer than
m) linear combinations of the original variables which account for most
of the variance in the data. This is done by first centering the data
matrix (i.e., so that each column has zero mean). Then the covariance
matrix, $X^T X$ (T denotes transpose; the normalisation by the number
of objects is omitted), is formed and diagonalized to obtain the
eigenvectors (principal axes) and eigenvalues (variances). Sometimes it
is desirable to scale each variable to have unit standard deviation. If
this is done then $X^T X$ is the correlation matrix. In our case the
scaling is undesirable since it magnifies the noise. Therefore, we use
the covariance matrix.

Typically the data matrix is rank deficient. That is, some of the
variances are no larger than the variance of the measurement noise.
Instead of trying to identify the actual noise level, one sometimes
decides to keep only those directions which ‘explain’ a certain
percentage of the variance. The data that remain after discarding the
directions with small variances are what we call PCA filtered data. In
many problems this type of filtering achieves a large reduction in the
dimension of the problem.
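The centering, covariance and "explained variance" steps can be sketched in a few lines. This is only an illustration: the text diagonalizes the full covariance matrix, whereas the sketch below swaps in plain power iteration to find just the leading principal axis, and the four-object data matrix is made up.

```python
import math


def center_columns(X):
    """Subtract the column means so every variable has zero mean."""
    n, m = len(X), len(X[0])
    means = [sum(row[k] for row in X) / n for k in range(m)]
    return [[row[k] - means[k] for k in range(m)] for row in X]


def scatter_matrix(Xc):
    """X^T X for a centered data matrix (covariance up to normalisation)."""
    m = len(Xc[0])
    return [[sum(row[a] * row[b] for row in Xc) for b in range(m)] for a in range(m)]


def leading_axis(C, iters=200):
    """Largest eigenvalue and eigenvector of a symmetric matrix by power iteration."""
    m = len(C)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(m)) for a in range(m))
    return lam, v


# Made-up data: 4 objects, 2 variables lying close to the line y = 2x.
X = [[0.0, 0.1], [1.0, 1.9], [2.0, 4.2], [3.0, 5.8]]
C = scatter_matrix(center_columns(X))
lam, axis = leading_axis(C)
explained = lam / (C[0][0] + C[1][1])   # fraction of total variance on this axis
print("principal axis:", axis, "explained fraction:", explained)
```

Keeping only the axes whose cumulative explained fraction reaches a chosen percentage is the filtering described above; for real spectra a library eigensolver would replace the power iteration.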

2.3.  Cluster  analysis 

In cluster analysis one classifies the objects into natural groupings
that contain objects with similar characteristics. There are a variety
of clustering techniques, but we will concentrate here on hierarchical
techniques [2] which operate on a matrix $D = (d_{ij})$ of distances
between the points $x_1, \ldots, x_n \in R^m$. Here we use the
Euclidean distance

$$ d_{ij} = \| x_i - x_j \| = \left[ \sum_{k=1}^{m} (x_{ik} - x_{jk})^2 \right]^{1/2} $$

and refer to D as the dissimilarity matrix.

Once the dissimilarity matrix is calculated, it is scanned and the
smallest dissimilarity found (say, $d_{ik}$). This defines the two
closest objects. The two objects are combined (i.e., replaced by a new
object, $i \cup k$) and the dissimilarity matrix updated according to
an agglomeration algorithm. The whole process is then repeated until
only two groups of objects remain. By keeping track of the order in
which the objects are agglomerated, the group structure is determined.

Two problems remain. The first concerns the choice of agglomeration
algorithm, and the second is deciding how many groups there are in the
data.

Agglomerative clustering methods have been motivated by graph theory
(linkage-based methods) or by geometry (cluster centre methods) [2].
We shall consider two methods of agglomeration:

(1) single linkage, where the dissimilarity matrix is updated by the
rule $d_{(i \cup k) j} = \min(d_{ij}, d_{kj})$;
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(2)  Ward's  minimum  variance  method,  where 
one  seeks  to  agglomerate  two  clusters  into  one 
such  that  the  variance  of  the  cluster  is  minimum. 
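Ward's criterion can be made concrete with a standard identity: merging clusters A and B raises the total within-cluster sum of squares by $\frac{n_A n_B}{n_A + n_B} \| c_A - c_B \|^2$, where $c_A$, $c_B$ are the centroids. The sketch below is an illustration with made-up points and hypothetical function names; it computes that merge cost and the within-cluster sum of squares it is derived from.

```python
def centroid(points):
    """Mean position of a list of equal-length coordinate lists."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def within_ss(points):
    """Sum of squared distances of the points to their centroid."""
    c = centroid(points)
    return sum(sum((x - y) ** 2 for x, y in zip(p, c)) for p in points)


def ward_cost(a, b):
    """Increase in total within-cluster sum of squares if a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


# Made-up clusters of two points each.
a = [[0.0, 0.0], [0.0, 2.0]]
b = [[4.0, 0.0], [4.0, 2.0]]
print("merge cost:", ward_cost(a, b))
print("direct check:", within_ss(a + b) - within_ss(a) - within_ss(b))
```

At each agglomeration step Ward's method merges the pair of clusters with the smallest such cost, which favours compact, roughly equal-sized groups.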

At each stage in the clustering process we know the dissimilarity
between two objects. When the output is obtained as a dendrogram the
links between the objects and the clusters as a function of
dissimilarity can be seen. Sudden changes in the value of the
dissimilarity suggest the existence of a natural set of clusters. It is
important to note that this grouping depends on the clustering
technique used, on the dissimilarity measure, and, most importantly, on
the investigator!
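The agglomeration loop described in this section can be sketched for the single linkage rule as follows. This is a minimal illustration, not the program used for the results: the function names are hypothetical, the stopping rule is a requested number of clusters, and the recorded merge distances are the quantities a dendrogram would display.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(points, n_clusters):
    """Agglomerate until n_clusters remain; return (clusters, merge history)."""
    clusters = {i: [i] for i in range(len(points))}
    # Dissimilarity matrix stored as a dict keyed by (i, j) with i < j.
    d = {(i, j): euclidean(points[i], points[j])
         for i in clusters for j in clusters if i < j}
    merges = []
    while len(clusters) > n_clusters:
        (i, k), dist = min(d.items(), key=lambda item: item[1])
        merges.append((sorted(clusters[i]), sorted(clusters[k]), dist))
        clusters[i] = clusters[i] + clusters.pop(k)
        # Single linkage update: d(i U k, j) = min(d(i, j), d(k, j)).
        for j in clusters:
            if j == i:
                continue
            d_kj = d.pop((min(k, j), max(k, j)))
            key = (min(i, j), max(i, j))
            d[key] = min(d[key], d_kj)
        del d[(i, k)]
    return [sorted(c) for c in clusters.values()], merges


# Made-up objects forming two well separated pairs.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
groups, history = single_linkage(points, 2)
print(groups)
for left, right, dist in history:
    print(left, "+", right, "at dissimilarity", dist)
```

A large jump between successive merge distances is the "sudden change" that suggests a natural number of clusters.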

3.  Experimental  system 

The cluster analysis was carried out on data obtained from measurements
of velocity time series in the wake of the flow past a cylinder (for a
description of the phenomena, see ref. [4]). Experiments were carried
out in a wind tunnel in the regime where vortices are shed
periodically. The experimental set up is shown in fig. 1. The cylinder
is 1.6 mm in diameter and 76 mm in length. One end of the cylinder was
fixed to the wind tunnel floor and the other was free. The free stream
velocity was held fixed at 1.2 m/s (Re = 130). The expected shedding
frequency for a circular cylinder of this diameter in a uniform stream
is 130 Hz [5].

Fig. 1. The experimental arrangement. The cylinder is 1.6 mm in
diameter and 76 mm in length. One end of the cylinder was fixed to the
wind tunnel floor and the other was free. The free stream velocity was
held fixed at 1.2 m/s (Re = 130). The hot-wire probe was mounted to
move on an arc in the plane of the cylinder. Thus the grid positions
for the measurements were distributed on two parallel arcs. The
reference positions of the probe, measured from the axis of the
cylinder, were 6.25 and 25 cylinder diameters.

Measurements were made directly behind the cylinder using a single
hot-wire probe, a Dantec 55P11, which was controlled by a Dantec
55M-series constant temperature anemometer. The probe was mounted to
move on an arc as shown in fig. 1. Thus the grid positions for the
measurements were distributed on two parallel arcs in the cylinder
wake. The reference positions of the probe for each arc, measured from
the axis of the cylinder, were 6.25 and 25 cylinder diameters. The
output of the anemometer, which was not linearised, is a measure of the
flow velocity at a point. The frequency response of the probe and
anemometer is of the order of several kHz, whereas the frequencies we
are interested in are those below 250 Hz.

Time series were taken at 21 points spaced uniformly along each arc at
intervals of 2° from -25° to 15° (taking a positive angle to be
measured away from the tunnel floor and zero to be the horizontal). For
each time series the sampling rate was 500 Hz.

4.  Results 

As mentioned above, we cluster the time series indirectly by applying
cluster analysis to the power spectra constructed from them. Thus in
our data matrix the objects are the 42 power spectra and the
frequencies are the variables. (Before constructing the power spectra,
the time series were adjusted to have zero mean.) Time series of
various lengths, and hence frequency resolution, (512, 1024 and 2048
points) have been analysed, and the spectra obtained are qualitatively
similar. Below we describe our results for 512 point time series (256
frequency
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values).  Thus  our  data  matrix  X  is  42  x  256.  As 
we wish to compare the shapes of the spectra,
not the amplitudes, each spectrum was normalised
by the amplitude of its largest peak (hence
after normalisation $\max_k x_{ik} = 1$ for all $i$).
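One row of the data matrix can be built as sketched below. This is an illustration under assumptions: the time series is a synthetic 117 Hz sine rather than hot-wire data, the spectrum is a plain squared-magnitude discrete Fourier transform with no windowing or averaging, and the 256 frequency values are taken to be bins 1 to 256 (the zero-frequency bin is dropped because the series has zero mean). With 512 points sampled at 500 Hz the bin spacing is 500/512, about 0.98 Hz.

```python
import cmath
import math


def power_spectrum(series):
    """Squared DFT magnitudes at bins 1..n/2 of the mean-removed series."""
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    spec = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(coeff) ** 2)
    return spec


def normalise(spec):
    """Scale a spectrum so that its largest peak equals 1."""
    peak = max(spec)
    return [s / peak for s in spec]


fs = 500.0   # sampling rate in Hz, as in the experiment
n = 512      # points per time series
series = [math.sin(2.0 * math.pi * 117.0 * t / fs) for t in range(n)]  # synthetic signal

row = normalise(power_spectrum(series))      # one object (row) of the data matrix
peak_bin = max(range(len(row)), key=lambda k: row[k]) + 1
peak_freq = peak_bin * fs / n
print(len(row), "frequency values; largest peak near", peak_freq, "Hz")
```

Stacking 42 such rows, one per probe position, gives the 42 x 256 matrix on which the clustering operates.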

4.1.  Manual  classification 

As a first step, a manual classification of the 42 power spectra was
performed. An initial look at the spectra showed that the maximum
amplitude varied greatly from spectrum to spectrum. The largest peaks
did not all occur at the same frequency, but were found to be at
either 20, 56, 87, 96, 103, 117 or 124 Hz.

When considering the overall shapes of the spectra, several
characteristic forms can be seen. The 13 groups suggested by the manual
classification are shown in fig. 2. In the figure spatial positions
1-21 correspond to a probe reference position of 6.25 cylinder
diameters, and positions 22-42 to a probe reference position of 25
cylinder diameters. The group number for each spatial position is given
in the enclosed box and the corresponding spatial positions are given
to the left and right of the box. The results of the manual grouping
are summarized in table 1, and typical spectra are shown in figs. 3-6.

Fig. 2. The groupings derived by visual inspection of the power
spectra. The group number for each spatial position is given in the
enclosed box; the corresponding spatial positions are given to the left
and right of the box. (The figure is sometimes called a dendrogram.)

Having carried out this manual cluster analysis, it is apparent that
the classification of the spectra is not a straightforward task, as
difficult decisions have to be made. For example, the distinction
between groups 7 and 8 is not at all clear. Also, if we compare just
the main frequencies and not the shape, some of the groups could be
combined. However, even though this manual classification process is
imperfect, we now have an approximation for the groupings that an
automatic classification should produce.

4.2.  Cluster  analysis  of  the  raw  spectra 

Ward’s minimum variance algorithm and the single linkage algorithm were
applied to the normalised spectra. Nine-cluster solutions for each
method are shown in fig. 7. Comparing these two figures with each other
and with fig. 2, it is clear that Ward’s method produces clusters more
consistent with the manual classification. The groups in the lower half
of the domain are determined quite well, and those in the upper half
are formed reasonably, if not quite as expected. There are, however,
some errors in
Table 1
Manually classified groups.

Columns: Group; Probe position (6.25 dia); Probe position (25 dia);
Main peaks (Hz); Sample spectrum shown in fig.

1 

1 

56.87 
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96 

3 
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87 
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25 
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26 

117 
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27-30 

117 
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9-11 

31-32 

117 
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12-13 

33-35 

100-130 
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9 

14-17 

36-38 

124 
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10 

18-19 

39-40 
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123, 126 
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13 
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Fig.  3.  Representative  power  spectra  from  manually  identified  groups  1-4:  grid  points  1.  3,  23.  .S 
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Fig.  4.  Representative  power  spectra  from  manually  identified  groups  5-7:  grid  points  26,  7,  10. 
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the  groupings  as  some  groups  contain  points  in 
both  the  upper  and  lower  half  of  the  domain  and 
this  is  not  consistent  with  the  manual  groupings. 
The  single  link  agglomeration  seems  to  have 
divided  the  domain  into  two  correctly,  but  the 
majority  of  the  groups  are  not  defined  correctly. 
As  a  result  of  this,  only  Ward’s  method  was  used 
in  the  following. 


4.3. Cluster analysis of the PCA filtered spectra

We now give the results of the clustering using the PCA filtered
spectra, and examine how sensitive the results are to the number of
principal axes retained. Fig. 8a shows the clustering based on using
the 4 most significant eigenvectors (54% of the variance), fig. 8b the
7 most significant (71% of the variance), and fig. 8c the 14 most
significant (93% of the variance).

Comparison of fig. 8c with fig. 7a shows that the groupings produced
using 14 axes are very similar to those obtained from clustering the
raw spectra. Hence it is clear that the data can be compressed by a
factor of 256/14.

Clustering algorithms require large amounts of cpu time. The data
compression achieved using PCA significantly reduces the cpu time
required to complete the clustering. However, the calculation and
diagonalization of the covariance matrix $X^T X$, a 256 x 256 matrix,
takes significantly more cpu time than the cluster analysis operating
directly on the 42 x 256 raw data matrix. Nevertheless, in a larger
study the number of objects will increase while the number of variables
remains constant. Thus at some stage, the cluster analysis of the PCA
filtered spectra will be more efficient.

Fig. 5. Representative power spectra from manually identified groups
8-10: grid points 34, 16, 18.

C.T. Shaw, G.P. King / Cluster analysis of time series

Fig. 6. Representative power spectra from manually identified groups
11-13: grid points 20, 41, 21.

4.4. Cluster analysis of binary coded spectra

As can be seen, the groupings produced by the PCA filtered data were
still some way from what was desired. At this point we felt that an
improvement might be possible by reducing the information in the
spectrum to a binary code. Let $x_i$ denote the power spectrum for grid
position i. Then $x_{ik}$ is the power in frequency bin k of spectrum
i. We now construct a new data matrix Y in the following way. Specify a
threshold $A_t$, and then set

$$ y_{ik} = \begin{cases} 1 & \text{if } x_{ik} \ge A_t , \\ 0 & \text{otherwise.} \end{cases} $$

This effectively produces a spectrum which contains only the dominant
frequencies as determined by the value of $A_t$.

A little thought (or experimentation) reveals a problem. Consider the
following set of binary coded spectra:

$$ y_1 = (1, 0, 0, \ldots, 0, 0) , \quad y_2 = (0, 1, 0, \ldots, 0, 0) , \quad y_3 = (0, 0, 0, \ldots, 0, 1) . $$

Fig. 7. Clusters obtained from analysis of the normalised spectra using
(a) Ward’s method and (b) the single linkage method.

Fig. 8. Clusters obtained using PCA filtered spectra. The spectra were
projected onto (a) the 4 most significant directions (54% of the
variance), (b) the 7 most significant directions (71% of the variance),
and (c) the 14 most significant directions (93% of the variance).

Fig. 9. Clusters obtained from the binary representation of the power
spectra (see text). The groupings, from left to right, correspond to
thresholds $A_t$ = 0.05, 0.1 and 0.2. As can be seen, the different
thresholds all yield slightly different groups, and while none
correspond perfectly with the manual groups, they are all closer than
the clustering of the PCA filtered spectra.

Since we know that these objects are derived from a power spectrum, it
is easy to see that $y_1$ and $y_2$ are similar, and that both are much
different from $y_3$. However, when the Euclidean distances between the
$y_i$ are calculated, we find $d_{12} = d_{23} = d_{13}$. Clearly
something is not quite as it should be, and the obvious conclusion is
that we need to measure distances in a way that reflects ‘nearness’ in
frequency. An alternative is to simply recognize that the frequency
resolution is too fine. A straightforward remedy is then to decrease
the frequency resolution by integrating the power in adjacent frequency
bins before reducing to a binary spectrum. This was done using 4
adjacent bins. Thus the spectrum was reduced from 256 to 64
frequencies. The results of the cluster analysis of the binary spectra
are shown in fig. 9. The thresholds, $A_t$ = 0.05, 0.1 and 0.2, all
yield slightly different groups, and while none correspond perfectly
with the manual groupings, they are all closer than all other attempts.
Thus we may well have reached a limit in the accuracy of the method.
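The distance problem and its remedy can be reproduced on a toy example. The sketch below makes several assumptions: the spectra are 8 bins long rather than 256, the threshold is applied to the summed power of each group of 4 adjacent bins without renormalising, and the function names are hypothetical.

```python
import math


def coarse_grain(spec, width):
    """Integrate the power over consecutive groups of `width` adjacent bins."""
    return [sum(spec[k:k + width]) for k in range(0, len(spec), width)]


def binarise(spec, threshold):
    """Binary code: 1 where the power reaches the threshold, 0 otherwise."""
    return [1 if s >= threshold else 0 for s in spec]


def euclidean(a, b):
    """Euclidean distance between two coded spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy normalised spectra: peaks in bin 0, bin 1 (adjacent) and bin 7 (far away).
s1 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
s2 = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
s3 = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
threshold = 0.1

fine = [binarise(s, threshold) for s in (s1, s2, s3)]
coarse = [binarise(coarse_grain(s, 4), threshold) for s in (s1, s2, s3)]

print("fine:   d12, d13 =", euclidean(fine[0], fine[1]), euclidean(fine[0], fine[2]))
print("coarse: d12, d13 =", euclidean(coarse[0], coarse[1]), euclidean(coarse[0], coarse[2]))
```

At full resolution all three codes are equally far apart; after coarse graining the two neighbouring peaks fall into the same bin and become identical, while the distant peak stays separate.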


5.  Discussion 

The choice of agglomeration method crucially affects the groups
produced by the clustering. For the data investigated here, Ward’s
minimum variance method produces a classification much closer to the
manual clustering than the single linkage method.

By using PCA, the dimension of the data can be reduced from 256 to 14.
However, there is a penalty to pay for this. The calculation of the
covariance matrix and its diagonalization takes considerable cpu time
compared to the time taken to cluster even the raw data. Nevertheless,
PCA data compression will become more attractive as the sample size
increases.

Clear improvements were obtained by representing the spectrum by a
binary code and by reducing the frequency resolution. We conclude that
these crude qualitative features were close to what the visual system
keyed on to achieve the manual grouping. The difficulty in identifying
just what set of operations should be carried out on the data before
applying the clustering algorithm illustrates the basic problem faced
by designers of machine classifiers.

6.  Conclusions 

Cluster analysis helps identify objects with similar characteristics.
This helps to identify features that need explaining and as a result a
deeper understanding of the system will be achieved.

In the present context cluster analysis helped identify regions in
space whose dynamics were similar. This reinforces the view that the
flow may be thought of as a set of spatially coupled local oscillators.
The cluster analysis helps reveal their position and spatial extent and
hence suggests where, for example, two probes are best located to study
the coupling or interaction of two oscillators in the flow. We hope to
carry out such a study in future work.

Another possible use of cluster analysis is in situations where it is
possible to employ arrays of sensors distributed over a large spatial
extent. It is easy to imagine that in such situations the main
difficulty will be the processing of the information from so many
probes in a reasonable length of time. Cluster analysis could be used
in a preliminary phase of the data acquisition to help determine which
probes to “listen to” under different operating conditions.
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We  address  the  question:  “Why  is  a  bridge  between  the  two  groups  desirable?" 


1.  Introduction 

It  is  a  well-known  fact  that  deterministic  chaos 
can  generate  data  which  are  apparently  random 
and  it  is  equally  well  known  that  from  time  to 
time  statisticians  have  to  resort  to  simulation  to 
solve  analytically  intractable  problems.  One 
would  therefore  expect  that  there  should  be 
much  common  ground  between  the  nonlinear 
dynamicists  and  the  statisticians.  It  is  interesting 
to  note  that  it  was  only  relatively  recently  that 
the  two  groups  began  to  interact  with  each  other 
in  a  constructive  way. 

Like other historical developments, there probably does not exist a
unique explanation as to why the interaction has been so slow in
coming. The basic tenet, written or unwritten, in chaos is that
randomness is associated with or even wholly generated by deterministic
chaos. On the other hand, statisticians, whether of the frequentist or
the Bayesian persuasion, accept randomness as given and try to live
with it. This fundamental difference in philosophy may account for much
of the differences in methodology of the two groups. Personally, I
believe that it is important to probe deeper and analyse the sources of
randomness, whilst accepting that there will always remain a proportion
of the randomness which cannot be adequately explained. In this sense,
I believe that a nonlinear dynamicist’s approach deals with the deeper
layer of randomness.


2.  Why  is  a  bridge  between  the  two  groups 
desirable? 

I  think  a  bridge  is  most  desirable  for  both 
groups  for  various  reasons.  First,  results  in  one 
area  might  accelerate  or  clarify  development  in 
the  other.  Second,  joint  efforts  might  help  solve 
some  hard  open  problems  common  to  both 
groups.  I  shall  illustrate  these  points  with  some 
examples,  the  choice  and  presentation  of  which 
cannot  perhaps  avoid  subjectivity  completely. 

(i)  In  chaos  one  typically  (though  not  invari¬ 
ably  nowadays)  deals  with  very  large  data  sets;  at 
the  recent  Workshop  at  Warwick  University, 
data sets of the size well beyond 10^ were frequently
mentioned. On the other hand, most
statisticians typically deal with a much more
modest size, say 10’. (This difference in emphasis
is not so much a question of equipment because
with modern computing equipment available to
the statisticians, the storage and manipulation of
large sample data is not a serious problem.)
Therefore,  given  the  different  data  environment, 
it  is  perhaps  not  surprising  that  there  are  differ¬ 
ent  methodological  emphases.  For  example, 
most  statisticians  subscribe  quite  vehemently  to 
the  Principle  of  Parsimony.  The  complexity  of 
the  model  must  be  penalised  so  as  to  produce  the 
simplest  model  that  one  can  get  away  with.  There 
are,  however,  sound  scientific  reasons  for  always 
penalising  over-fitting,  whatever  the  sample  size. 
(See,  e.g.,  [15].)  As  a  simple  example,  let  a 
linear  autoregressive  (AR)  model  be  fitted  to  the 
observations $(X_1, X_2, \ldots, X_N)$, assumed Gaussian
and stationary, either by the least squares
method  or  the  maximum  likelihood  method.  It  is 
well known that the sum of squares of the residuals
(RSS) tends to decrease with increasing ‘complexity’,
in this case the number of past values
upon which each current observation is regressed,
i.e. the order of the linear autoregression. A
naive  suggestion  would  then  be  to  fit  as  high  an 
order  as  possible.  What  is  the  price?  The  price  is 
that  such  an  over-fitted  model  has  very  low 
predictive  power.  To  quantify  this  statement,  let 
$E(X_{N+1} - \hat{X}_{N+1})^2$ denote the mean squared error
of (one-step-ahead) prediction, where $\hat{X}_{N+1}$ is
the prediction obtained from the model fitted to
the data $(X_1, \ldots, X_N)$. It turns out that under
general conditions for a $p$th order AR model,
AR($p$),

$$E(X_{N+1} - \hat{X}_{N+1})^2 = \sigma^2 \left(1 + \frac{p}{N}\right) + o\!\left(\frac{1}{N}\right), \qquad (1)$$

where $\sigma^2$ is the minimum mean squared error of
prediction when the model is known. Thus the
penalty is measured by the multiplying factor
$(1 + p/N)$, which increases with $p$. In practical
terms, (1) suggests that there will come a time
when the reduction in RSS (note that RSS/$(N - p)$
is an unbiased estimate of $\sigma^2$) will not
be sufficient to compensate for the increase in
the penalty due to overfitting, thus leading to the
generally  poorer  performance  in  prediction  with 
an  over-fitted  model.  The  above  discussion  is 
quite  heuristic  and  the  interested  readers  are 
referred  to  [15]  and  the  references  therein  for  a 
more  detailed  discussion. 
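
The trade-off expressed by eq. (1) can be made concrete with a few lines of arithmetic. The sketch below is only an illustration: it replaces $\sigma^2$ in (1) by its estimate RSS/$(N - p)$, drops the remainder term, and evaluates the result for a set of made-up RSS values. The numbers and the function name `estimated_prediction_mse` are assumptions introduced here, not results from any data set discussed in this note. The quantity computed coincides with what is usually called Akaike's final prediction error.

```python
def estimated_prediction_mse(rss, n_obs, order):
    """Plug-in version of eq. (1): sigma^2 is replaced by RSS / (N - p)
    and the remainder term is ignored."""
    sigma2_hat = rss / (n_obs - order)          # unbiased estimate of sigma^2
    return sigma2_hat * (1.0 + order / n_obs)   # penalty factor (1 + p/N)


# Hypothetical residual sums of squares for AR(1)..AR(5) fitted to N = 100 points.
# RSS keeps decreasing with the order, but only marginally beyond order 2.
N_OBS = 100
RSS_BY_ORDER = {1: 120.0, 2: 100.0, 3: 99.5, 4: 99.2, 5: 99.0}

scores = {p: estimated_prediction_mse(rss, N_OBS, p)
          for p, rss in RSS_BY_ORDER.items()}
best_order = min(scores, key=scores.get)

for p in sorted(scores):
    print(f"AR({p}): RSS = {RSS_BY_ORDER[p]:6.1f}, "
          f"estimated prediction MSE = {scores[p]:.4f}")
print("order with smallest estimated prediction MSE:", best_order)
```

With these illustrative numbers the small gains in RSS beyond order 2 are outweighed by the growing factor $(1 + p/N)$, so the criterion selects order 2 rather than the highest order fitted.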

It  seems  to  me  that  the  dynamical  systems 
community  could  benefit  greatly  from  a  sys¬ 
tematic  adoption  of  the  above  principle.  As  an 
example, recently [3] has considered the estimation
of the embedding dimension for a “noisy”
dynamical system:

$$X_t = F(X_{t-1}, \ldots, X_{t-d}) + e_t, \qquad (2)$$

where $F$ is an unknown function and $\{e_t\}$ is a
sequence of martingale differences with unknown
variance $\sigma^2$, and $d$ is an unknown positive
integer representing the embedding dimension.
The objective is to estimate $d$ from the given
observations $(X_1, \ldots, X_N)$ of an assumed
stationary time series with finite variance and
absolutely continuous distribution. As an estimate
of $F(z_1, \ldots, z_d)$, we calculate

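$$\hat{F}_{N,\setminus t}(z_1, z_2, \ldots, z_d) = \frac{\sum_{s \neq t} X_s \prod_{i=1}^{d} k\big((z_i - X_{s-i})/h_N\big)}{\sum_{s \neq t} \prod_{i=1}^{d} k\big((z_i - X_{s-i})/h_N\big)}, \qquad (3)$$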

where, for the present note, $k$ may be taken as
any smooth probability density function with finite
absolute mean, and $h_N$ is called the bandwidth,
which controls the amount of smoothing
over the neighbourhoods of the $z_i$'s. Note that
there is some similarity between (3) and the
localized receptive fields in neural networks [12].
Let $\hat{F}_N$ denote an analogous estimate but without
any deletion. The delete-one procedure is actually
a rather subtle way of imposing a penalty on
overfitting. Specifically, it may be shown [3] that
under quite standard conditions the following
scaling law holds:


H. Tong / Bridging nonlinear dynamics and statistics


[...]   (4)

where $\alpha = k(0)$ and $o_p$ denotes the “little o in
probability”. Note that eq. (4) may be compared
with eq. (1). Here the penalty term is principally
due to $h_N^{-d}$ (note that $h_N < 1$ for sufficiently large
$N$), which increases with $d$, the embedding dimension.

(ii) An important tool in dynamical systems is
the singular value decomposition (SVD) introduced
to the chaos literature by Broomhead and
King [1]. The SVD is founded on the Karhunen-
Loeve (KL) expansion. It is perhaps pertinent to
repeat briefly the KL expansion here as I believe
that further exploitation in the chaos study may
still be possible. (For more detail, see e.g. [16].)

Let $\{X(t): \alpha \leq t \leq \beta\}$ ($\alpha, \beta \in \mathbb{R}$) be a
mean-square continuous-time time series and
$\mathrm{Var}\, X(t) < \infty$, all $t$. Let $\rho(s, t) = \mathrm{corr}(X(s), X(t))$
denote the autocorrelation function. Routine
consideration of the linear integral equation

$$\int_{\alpha}^{\beta} \rho(s, t)\, \psi(s)\, \mathrm{d}s = \lambda \psi(t), \quad \alpha \leq t \leq \beta, \qquad (5)$$

where $\rho(s, t)$ acts as the standard Hermitian
positive definite kernel, yields upon Mercer’s
theorem the uniformly convergent series (in both
variables)

$$\rho(s, t) = \sum_{j=1}^{\infty} \lambda_j \psi_j(s)\, \psi_j(t), \qquad (6)$$

where the $\lambda_j$’s are called the eigenvalues (>0)
and the functions $\psi_j(s)$ are called the eigenfunctions
such that

$$\int_{\alpha}^{\beta} \psi_j(t)\, \psi_k(t)\, \mathrm{d}t = \delta_{jk}. \qquad (7)$$

A more fundamental result in this approach is
that (6) implies and is implied by the following
generalised spectral representation of the time
series:

$$X(t) = \sum_{j=1}^{\infty} \lambda_j^{1/2}\, \psi_j(t)\, Z_j, \qquad (8)$$

where

$$Z_j = \lambda_j^{-1/2} \int_{\alpha}^{\beta} X(t)\, \psi_j(t)\, \mathrm{d}t, \qquad (9)$$

so that

$$\mathrm{corr}(Z_j, Z_k) = \delta_{jk}. \qquad (10)$$

In  statistics  (or  rather  probability),  the  repre¬ 
sentation  (8)  is  commonly  called  the  Karhunen- 
Loeve  (KL)  expansion  after  K.  Karhunen  [9] 
and  M.  Loeve  [10],  although  the  same  expansion 
was  introduced  independently  by  many  others. 
(See, e.g., [16].) In particular, when $t \in \{1, 2, \ldots, m\}$,
then we have a finite collection of
random variables $(X(1), \ldots, X(m))$ and the KL
expansion  reduces  to  the  well-known  principal 
component  analysis  introduced  by  H.  Hotelling 
in  1933  in  his  study  of  educational  psychology 
[8]. 

It  is  relevant  to  point  out  two  facts,  which  do 
not  seem  sufficiently  widely  appreciated  in  the 
chaos  literature.  First,  stationarity  is  not  neces¬ 
sary  in  the  above  discussion;  only  finite  variance 
is  needed.  This  is  significant  because  almost  all 
the  applications  of  the  KL  expansion  in  the 
chaos  literature  are  (in  my  view  unnecessarily) 
restricted  to  stationary  time  series.  ([6]  is  a 
notable  exception  although  the  authors  still  refer 
to  the  Toeplitz  property  of  the  “covariance” 
matrix.)  Second,  although  it  might  be  quite 
reasonable to arrange the sample eigenvalues
(obtained from data) in descending order
$\lambda_1 \geq \lambda_2 \geq \ldots$ and use the ratio of the sum of the first
few $\lambda$’s to the “trace” as a measure of the
amount of information explained by the first few
principal components, the statistical sampling
properties of this ratio statistic are by no means
trivial. See, e.g., [14]. It might also be worthwhile
exploring the use of the Principle of Parsimony
in the determination of the number of
principal components in the context of chaos.
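
For a finite index set the KL expansion can be computed directly from an estimated covariance matrix. The following sketch illustrates the first point made above, namely that no stationarity is required: it uses an ensemble of simulated random walks (a deliberately non-stationary example chosen here for illustration), estimates the covariance of $(X(1), \ldots, X(m))$ across realisations, and computes the eigenvalues and the trace ratio. The covariance (rather than correlation) kernel, the random-walk data, the Jacobi eigenvalue routine and all function names are assumptions of this sketch.

```python
import math
import random


def sample_covariance(realisations):
    """Covariance matrix of (X(1), ..., X(m)) estimated across realisations.

    Entry (s, t) is estimated separately for every pair of time points,
    so the matrix is not forced to be Toeplitz.
    """
    n, m = len(realisations), len(realisations[0])
    means = [sum(row[t] for row in realisations) / n for t in range(m)]
    return [[sum((row[s] - means[s]) * (row[t] - means[t])
                 for row in realisations) / (n - 1)
             for t in range(m)] for s in range(m)]


def symmetric_eigenvalues(matrix, sweeps=30):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    m = len(a)
    for _ in range(sweeps):
        for p in range(m - 1):
            for q in range(p + 1, m):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(m):      # update columns p and q (A <- A J)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(m):      # update rows p and q (A <- J^T A)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(m)), reverse=True)


def explained_fraction(eigenvalues, k):
    """Share of the trace carried by the k largest eigenvalues."""
    return sum(eigenvalues[:k]) / sum(eigenvalues)


random.seed(0)
M_TIMES, N_REAL = 4, 500
ensemble = []
for _ in range(N_REAL):
    x, path = 0.0, []
    for _ in range(M_TIMES):
        x += random.gauss(0.0, 1.0)     # random walk: variance grows with t
        path.append(x)
    ensemble.append(path)

cov = sample_covariance(ensemble)
eigs = symmetric_eigenvalues(cov)
print("eigenvalues:", [round(v, 3) for v in eigs])
print("fraction of trace in first component:", round(explained_fraction(eigs, 1), 3))
```

The printed ratio is itself a random quantity: repeating the run with another seed gives a different value, which is one way to get a feeling for the sampling variability referred to above.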

(iii)  Modelling  based  on  local  function  approx¬ 
imation  is  witnessing  rapid  development  in  the 
chaos  literature.  As  recognised  by  Farmer  and 
Sidorowich  [4],  Casdagli  [2],  Grassberger  et  al. 
[7]  and  Sugihara  and  May  [13],  the  threshold 
models  introduced  by  me  in  the  late  1970’s  and 
early  1980’s  were  a  precursor  to  this  develop¬ 
ment.  Of  course,  much  has  developed  since,  as 
demonstrated  in  the  above  references.  It  is  per¬ 
haps  worth  remembering  that  the  basic  point  of 
the  threshold  models  may  be  summarised  in  the 
form  of  the  threshold  principle,  which  advocates 
the  local  approximation  over  states.  I  have 
elaborated  this  principle  elsewhere  and  most  re¬ 
cently  in  [15].  The  fact  that  only  one,  two  or 
three  thresholds  were  used  in  the  early  develop¬ 
ment  should  be  seen  in  perspective  because 

(a)  the  choice  was  made  merely  for  computa¬ 
tional  convenience  (SUN  workstations  were  not 
available  in  1980); 

(b) the principle of parsimony dictated that
the  data  sets  analysed  by  me  and  my  associates 
did  not  warrant  too  many  thresholds. 

Of  course,  once  the  threshold  principle  of 
“divide  and  rule”  is  accepted  as  a  useful  concept 
there  is  then  no  limit  to  the  computational  var¬ 
ieties.  Indeed,  the  statisticians  Lewis  and  Stevens 
have  recently  adapted  the  powerful  numerical 
algorithm  of  multivariate  adaptive  regression 
splines  (MARS)  due  to  Friedman  [5]  to  provide 
a  versatile  and  efficient  implementation  of  the 
threshold  principle.  [Further  details  may  be 
found  in  ref.  [11]].  In  fact,  the  area  of  non- 
parametric  time  series  modelling  provides  an 
excellent  common  ground  for  joint  exploration. 
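
As a purely illustrative sketch of the threshold principle, the code below fits a first-order autoregression separately on each side of a single threshold and chooses the threshold by minimising the total residual sum of squares. It is not the estimation procedure of the cited references: the simulated two-regime model, the search over empirical percentiles and the names `fit_line` and `fit_threshold_ar1` are all assumptions made for the example.

```python
import random


def fit_line(pairs):
    """Ordinary least squares fit y = a + b*x; returns (a, b, rss)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - a - b * x) ** 2 for x, y in pairs)
    return a, b, rss


def fit_threshold_ar1(series, n_candidates=99):
    """Fit X_t = a_j + b_j X_{t-1} + e_t, regime j given by X_{t-1} <= r or > r.

    Candidate thresholds r are empirical percentiles of the series; the one
    giving the smallest total residual sum of squares is kept.
    """
    pairs = list(zip(series[:-1], series[1:]))
    ordered = sorted(series)
    best = None
    for i in range(1, n_candidates + 1):
        r = ordered[i * len(ordered) // (n_candidates + 1)]
        lower = [p for p in pairs if p[0] <= r]
        upper = [p for p in pairs if p[0] > r]
        if len(lower) < 10 or len(upper) < 10:
            continue                       # too few points for a regression
        fit_lo, fit_up = fit_line(lower), fit_line(upper)
        total_rss = fit_lo[2] + fit_up[2]
        if best is None or total_rss < best["rss"]:
            best = {"threshold": r, "lower": fit_lo[:2],
                    "upper": fit_up[:2], "rss": total_rss}
    return best


# Simulated two-regime series (hypothetical parameters, threshold at 0).
random.seed(1)
x, series = 0.0, []
for _ in range(3000):
    if x <= 0.0:
        x = 1.0 + 0.6 * x + random.gauss(0.0, 0.5)
    else:
        x = -1.0 + 0.6 * x + random.gauss(0.0, 0.5)
    series.append(x)

model = fit_threshold_ar1(series)
single = fit_line(list(zip(series[:-1], series[1:])))
print("estimated threshold:", round(model["threshold"], 3))
print("lower regime (a, b):", tuple(round(v, 3) for v in model["lower"]))
print("upper regime (a, b):", tuple(round(v, 3) for v in model["upper"]))
print("RSS two regimes vs. one line:", round(model["rss"], 1), round(single[2], 1))
```

Adding further thresholds is a matter of extending the search, at the price of more parameters; whether the data warrant them is exactly the question that the principle of parsimony is meant to answer.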

(iv)  To  fully  comprehend  the  intimate  rela¬ 
tionship  between  order  and  disorder,  low  dimen¬ 
sional  attractors  and  high  dimensional  attractors 
and so on is a challenge facing both the nonlinear
dynamicists and the statisticians. There is
much to be gained if there is closer collaboration
between  the  two  groups.  Quite  often  knowledge 
from  both  areas  is  essential  in  order  to  attack  a 
problem of common concern. For example, consider
a deterministic model

$$X(t) = F(X(t-1)), \quad t = 1, 2, 3, \ldots, \qquad (11)$$

where $X(t) \in \mathbb{R}$. Suppose $X(t)$ is not observable
and instead we observe

$$Y(t) = X(t) + e(t), \qquad (12)$$

where $e(t)$ is the measurement/observation
noise. The above set-up is very common in the
chaos literature. Now, from (11) and (12) we
may deduce that approximately

$$Y(t) = F(Y(t-1)) + e(t) - F'(Y(t-1))\, e(t-1), \qquad (13)$$

where $F'$ denotes the derivative. If the Lyapunov
exponent of (11) is positive so that the deterministic
model is sensitive to initial conditions,
then the stochastic model (13) will in
general tend not to be invertible in the sense that
the noise term $e(t)$ will not be measurable with
respect to the sigma algebra generated by $Y(t)$,
$Y(t-1), \ldots$. Without invertibility, statistical
inference/estimation based on maximum likelihood
of any parametrised form of $F$ would be
extremely  difficult.  The  latter  problem  is  well 
known  in  statistical  nonlinear  time  series  analy¬ 
sis.  (See,  e.g.,  [15],  p.  309.) 
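
The step leading to (13) is a first-order Taylor expansion. The short derivation below is added only to make that step explicit; it assumes that $F$ is differentiable and that the noise is small enough for terms of second order in $e(t-1)$ to be neglected.

```latex
\begin{align*}
Y(t) &= X(t) + e(t) = F\bigl(X(t-1)\bigr) + e(t) \\
     &= F\bigl(Y(t-1) - e(t-1)\bigr) + e(t) \\
     &\approx F\bigl(Y(t-1)\bigr) - F'\bigl(Y(t-1)\bigr)\, e(t-1) + e(t).
\end{align*}
```

Heuristically, the noise enters (13) in moving-average form with coefficient $-F'(Y(t-1))$, and a positive Lyapunov exponent means that $\ln|F'|$ is positive on average along the orbit, which is the situation in which recovering $e(t)$ from past observations becomes problematic.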

Another  example  of  particular  current  con¬ 
cern,  some  subtleties  of  which  might  have  been 
overlooked  in  the  literature,  has  to  do  with  a 
model  with  dynamic  (i.e.  system)  noise: 

$$Z_t^{(\varepsilon)} = F^{(\varepsilon)}(Z_{t-1}^{(\varepsilon)}) + e_t^{(\varepsilon)}, \quad t = 1, 2, \ldots, \qquad (14)$$

where

$$Z_t^{(0)} = F^{(0)}(Z_{t-1}^{(0)}) \qquad (15)$$

corresponds to the underlying deterministic
model of interest. For simplicity we assume that
both $Z$ and $e$ are real scalars. Let $\lambda^{(0)}$ denote the
Lyapunov exponent of the system (15), which is
assumed to be ergodic with invariant measure
$\mu^{(0)}$ induced by (15). Similarly, let $\lambda^{(\varepsilon)}$ denote
the Lyapunov exponent of the stochastic system
(14), again assumed to be ergodic with invariant
measure $\mu^{(\varepsilon)}$ induced by (14). Specifically,

$$\lambda^{(\varepsilon)} = \int \ln\left|\mathrm{d}F^{(\varepsilon)}(x)/\mathrm{d}x\right| \, \mu^{(\varepsilon)}(\mathrm{d}x), \quad \varepsilon \geq 0.$$

Now, given observations from (14), the obvious sample
version $\hat{\lambda}^{(\varepsilon)}$, say, of $\lambda^{(\varepsilon)}$ is a natural estimate of
$\lambda^{(\varepsilon)}$, but not $\lambda^{(0)}$. Therefore, we plainly need to
correct $\hat{\lambda}^{(\varepsilon)}$ for bias if it is used to estimate $\lambda^{(0)}$.
(A similar remark applies to the Grassberger
correlation dimension.)
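
The distinction between the two exponents can be explored numerically. The sketch below computes the sample average of $\ln|F'(Z_t)|$ along an orbit of a quadratic map, once without and once with dynamic noise. Everything specific in it is an assumption made for illustration: the map $F(z) = 1 - 2z^2$ (for which the noise-free exponent is known to be $\ln 2$), the use of the same $F$ in both cases, the Gaussian noise, its standard deviation, and the clipping of the state to $[-1, 1]$ that keeps the noisy orbit from escaping.

```python
import math
import random


def sample_lyapunov(noise_sd, n_steps=50000, seed=0):
    """Average of ln|F'(Z_t)| along one orbit of Z_t = F(Z_{t-1}) + e_t,
    with F(z) = 1 - 2 z^2 and Gaussian e_t of standard deviation noise_sd.

    After the noise is added the state is clipped to [-1, 1] so that the
    orbit cannot escape; this clipping is an ad hoc device of the sketch.
    """
    rng = random.Random(seed)
    z, total = 0.3, 0.0
    for _ in range(n_steps):
        total += math.log(max(abs(-4.0 * z), 1e-300))   # ln|F'(z)|, F'(z) = -4z
        z = 1.0 - 2.0 * z * z
        if noise_sd > 0.0:
            z = min(1.0, max(-1.0, z + rng.gauss(0.0, noise_sd)))
    return total / n_steps


lam_0 = sample_lyapunov(0.0)       # noise-free system; exact value is ln 2
lam_eps = sample_lyapunov(0.05)    # same map driven by dynamic noise

print("sample exponent without noise:", round(lam_0, 4), "(ln 2 =", round(math.log(2), 4), ")")
print("sample exponent with noise   :", round(lam_eps, 4))
```

The second number is an estimate of the exponent of the noisy system, evaluated under the invariant measure of that system; any difference from the first is the kind of bias that would have to be corrected before interpreting it as the exponent of the deterministic model.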

A more fundamental question which does not
seem to have been addressed is this: whilst it is
well known that $\lambda^{(0)}$ measures the exponential
divergence of two initial points in state space
upon iteration under $F^{(0)}$, it is not clear to me
that $\lambda^{(\varepsilon)}$ measures the “exponential divergence”
of two initial distributions, which seems to me
the more relevant concept to develop.

We identify the basic ingredients determining the structure of the power spectra of non-linear dynamical systems in a
hierarchical  order  of  importance.  The  analysis,  performed  with  the  help  of  symbolic  methods,  shows  that  dynamical 
invariants  such  as  topological  and  metric  properties  of  the  symbolic  orbits  explain  the  main  qualitative  features  of  the 
spectra,  whereas  the  coordinate-dependent  values  of  the  observable  itself  represent  a  less  relevant  contribution. 
Consideration  of  simple  dynamical  models  with  increasing  number  of  topological  transition  rules  evidences  the  formation 
of  coherent  structures  (peaks)  and  explains  their  position  and  size.  By  constructing  the  parse  tree  of  the  allowed  symbolic 
itineraries,  it  is  possible  to  estimate  conditional  probabilities  by  considering  orbits  belonging  to  adjacent  tree  levels. 
Accordingly, a Markov transition matrix is obtained for each level $l$ and is used to generate signals with statistical
properties which approximate those of the actual one increasingly better for $l \to \infty$. A considerable improvement is
achieved  by  recoding  the  original  signal  in  terms  of  variable-length  words  and  by  re-applying  the  above  procedure  to  the 
transformed  signal,  which  is  equivalent  to  a  renormalization  operation  of  the  associated  dynamical  map.  The  accuracy  of 
the  estimates  is  directly  related  to  the  convergence  of  the  scaling  function  for  the  conditional  probabilities.  Analytic  results 
are  presented  for  the  simplest  five  Markov  models  arising  from  piecewise-linear,  continuous,  one-dimensional  maps. 
Numerical  studies  have  been  performed  for  the  logistic  and  Henon  maps  and  for  the  Lorenz  system. 


1.  Introduction 

Nonlinear  dissipative  dynamical  systems  ex¬ 
hibiting  chaotic  behaviour  have  been  mostly 
characterized  by  evaluating  dynamical  invariants 
like  metric  entropies,  Lyapunov  exponents  and 
fractal  dimensions  [1,  2]  by  means  of  time  aver¬ 
ages  over  randomly  sampled  long  trajectories. 
More  traditional  statistical  indicators,  such  as 
power  spectra,  have  been  discussed  only  in  con¬ 
nection  with  phenomena  at  the  border  of  chaos, 
because  of  their  non-invariance  under  smooth 
coordinate changes. Recent developments in the
field, however, suggest reconsidering such an
approach  within  a  more  general  theoretical 
framework.  In  fact,  it  is  possible  to  perform  a 
systematic  hierarchical  modelling  of  nonlinear 
dynamical  systems  by  means  of  symbolic  meth¬ 
ods.  These  allow  decomposing  the  dynamics  into 
sub-processes  which  can  be  associated  with  a  tree 
structure,  so  that  an  importance  ordering  of  the 
relevant  features  is  obtained.  As  a  consequence, 
the  values  of  the  dynamical  invariants  can  be 
estimated  through  ensemble  (“thermodynamic” 
[3])  averages  which  yield  higher  accuracy  than 
the ordinary time averages, especially when the
dynamics folds phase space completely and the
topology is simply described by a full ($n$-ary) tree
[4]. In the generic case of partial folding, the
symbolic  dynamics  is  represented  by  incomplete 
trees.  The  most  efficient  description  is  then  ob¬ 
tained  by  splitting  the  symbolic  signal  into  vari¬ 
able-length  symbolic  strings  [5]  constituting  a 
prefix-free  code  [6].  This  allows  one  to  obtain  a 
renormalized  picture  of  the  dynamics  by  means 
of  a  simple  recoding  process.  In  the  special  case 
of  a  self-similar  signal,  such  as  those  produced  at 
the  period-doubling  accumulation  point  [7]  and 
at  the  quasiperiodic  transitions  to  chaos  [8],  it  is 
possible  to  achieve  an  infinite  renormalization 
automatically. 

In  this  work,  we  apply  the  procedure  intro¬ 
duced  in  ref.  [5]  to  the  resolution  of  the  structure 
of  power  spectra,  showing  that  their  features  are 
determined,  first  of  all,  by  the  topology  of  the 
corresponding  trees  and,  secondly,  by  the  metric 
properties  (probabilities)  of  the  orbits.  This  first 
skeleton  of  the  spectra  is  invariant  under  smooth 
coordinate  changes.  The  final  contribution  is 
constituted  by  the  values  taken  by  the  measured 
observable  along  the  actual  trajectory  (not  the 
symbolic  one),  and  is  obviously  non-invariant.  A 
succession  of  models  is  automatically  constructed 
and  employed  to  reproduce  signals  whose  statis¬ 
tical  properties  approach  increasingly  better 
those  of  the  original  system.  This  is  obtained  by 
evaluating  transition  probabilities  for  blocks  of 
symbols  of  increasing  length.  The  accuracy 
achieved  by  the  resulting  Markov  models  is  di¬ 
rectly  connected  with  the  convergence  properties 
of  the  scaling  function  [9]  for  the  orbit  prob¬ 
abilities.  We  show  that  the  successive  approxi¬ 
mations  quickly  approach  a  limit  curve,  for  sev¬ 
eral  dynamical  systems,  if  the  proper  symbolic 
ordering  and  the  above  mentioned  coding  tech¬ 
niques  are  used. 

This  approach  allows  one  to  understand  the 
formation  of  power  spectra  in  generic  systems, 
whereas  previous  analyses  had  concerned  only 
simple  examples,  such  as  one-dimensional 
piecewise-linear maps [10, 11] and axiom-A systems
[12], or particular phenomena like period-doubling
[7], intermittency [13], diffusion [14],
and “periodic chaos” [15]; the decay of correlations
in area-preserving maps has been investigated
in ref. [16]. We illustrate our hierarchical,
variable-order,  method  by  presenting  numerical 
studies  of  the  logistic  and  Henon  maps  and  of 
the  Lorenz  system.  Finally,  we  show  how  the 
features  of  generic  spectra  (position,  width  and 
height  of  the  peaks)  can  be  explained  by  solving 
analytically  a  series  of  increasingly  complicated 
Markov  models  describing  suitable  piecewise- 
linear,  continuous,  one-dimensional  maps. 


2.  Shaping  of  power  spectra  by  incomplete 
phase-space  folding 


Dissipative  chaotic  systems  exhibit  power 
spectra  characterized  by  broadened  peaks  whose 
position  and  height  are  the  effect  of  the  complex 
stretching  and  folding  mechanism  acting  on 
phase  space.  The  spectra  of  conservative 
dynamical  systems  present,  superimposed  on  the 
continuous  background,  sharp  peaks  which  are 
usually  originated  by  motion  in  the  vicinity  of 
invariant  tori  [17].  Although  some  of  these  fea¬ 
tures  might  be  intuitively  attributed  to  the  in¬ 
fluence  of  unstable  periodic  orbits  whose  neigh¬ 
bourhoods  are  visited  by  the  trajectory,  no  caus¬ 
al  relation  can  be  identified,  in  general.  Previous 
investigations,  indeed,  have  been  based  on  quite 
a  wide  range  of  different  mathematical  ap¬ 
proaches  and  no  unique  interpretation  scheme 
has  emerged.  In  this  section  we  review  some  of 
the  major  difficulties  in  a  qualitative  way,  before 
going  to  a  more  systematic  treatment  of  the 
problem.  Consider,  for  example,  the  Henon  map 
[18] 

$$(x_{n+1}, y_{n+1}) = (a - x_n^2 + b y_n,\; x_n) \qquad (2.1)$$

at standard parameter values ($a = 1.4$, $b = 0.3$).
In fig. 1, we show the power spectrum

$$S(f) = \left| \sum_{n=0}^{N-1} x_n \, e^{2\pi i f n} \right|^2 \qquad (2.2)$$

for the $x$-coordinate, versus the frequency $f = k/N$
($k = 0, 1, \ldots, N/2$), computed by averaging
over 10^ single spectra obtained by Fourier transforming
signals of length $N = 4096$ iterates. The
area under the curve has been normalized to 1
and the natural logarithm of $S(f)$ has been taken
after subtraction of the zero-frequency component.
Although a few peaks seem to be related
to integer periods ($f =$ [...]), others are either
displaced from the “expected position” (as the
one to the left of $f = 1/2$, corresponding to period-two)
or close to frequencies corresponding to
non-existing periods [...]: the lowest-order
unstable periodic orbits of the map, in fact, have
lengths 1, 2, 4, 6, 7 and 8. Notice that no distinct
feature emerges in $S(f)$ which can be trivially
associated with a period-six orbit.

Fig. 1. Natural logarithm of the power spectrum $S(f)$ for the
$x$-coordinate of the Henon map at ($a = 1.4$, $b = 0.3$), versus
the frequency $f$. The positions corresponding to periodicity 3,
4, 7 and 8 are indicated by dotted lines.
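
A reduced version of the computation behind fig. 1 can be sketched as follows. The code iterates the map (2.1), cuts the $x$-signal into segments, removes the mean of each segment, averages the squared moduli of the discrete Fourier transforms and normalizes the area to 1. The segment length (256 instead of 4096), the number of segments, the transient length, the initial condition and the direct (non-fast) Fourier sum are choices made here to keep the example short and dependency-free; they are not the settings used for the figure.

```python
import cmath
import math


def henon_x(n_points, a=1.4, b=0.3, n_transient=1000):
    """x-coordinate of the Henon map written in the form of eq. (2.1)."""
    x, y = 0.0, 0.0
    out = []
    for i in range(n_transient + n_points):
        x, y = a - x * x + b * y, x
        if i >= n_transient:
            out.append(x)
    return out


def averaged_spectrum(signal, seg_len):
    """Average |DFT|^2 over consecutive segments, for f = k/seg_len, k = 0..seg_len/2.

    The mean of each segment is removed first (zero-frequency component)
    and the result is scaled so that sum(S) / seg_len, the area under the
    curve, equals 1.
    """
    twiddle = [cmath.exp(-2j * math.pi * j / seg_len) for j in range(seg_len)]
    n_freq = seg_len // 2 + 1
    spec = [0.0] * n_freq
    for s in range(len(signal) // seg_len):
        seg = signal[s * seg_len:(s + 1) * seg_len]
        mean = sum(seg) / seg_len
        seg = [v - mean for v in seg]
        for k in range(n_freq):
            amp = sum(v * twiddle[(k * n) % seg_len] for n, v in enumerate(seg))
            spec[k] += abs(amp) ** 2
    area = sum(spec) / seg_len
    return [v / area for v in spec]


SEG_LEN, N_SEG = 256, 16
spectrum = averaged_spectrum(henon_x(SEG_LEN * N_SEG), SEG_LEN)
peak = max(range(1, len(spectrum)), key=spectrum.__getitem__)
print("frequency of the largest peak:", peak / SEG_LEN)
```

Taking the logarithm of `spectrum[1:]` gives a coarse counterpart of the curve described above; with so few and so short segments the estimate is much noisier than one based on long records.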

Furthermore,  the  amplitudes  of  the  peaks  can¬ 
not  be  explained  in  terms  of  the  time  spent  by 
the  chaotic  trajectory  in  the  neighborhood  of  the 
“corresponding”  periodic  orbit.  Indeed,  by  using 
a  generating  partition  [19],  one  can  label  phase- 
space regions with integers and study the occurrence
probability of all symbolic sequences produced
by the system. This investigation shows
that  no  simple  relationship  exists  between  the 
amplitudes  of  the  peaks  and  the  probability  of 
symbolic  sequences  of  the  corresponding  length 
which  are  periodically  extendable  [20].  For  ex¬ 
ample,  the  unstable  period-two  orbit  of  the 
Henon  map  can  be  labelled  by  the  sequence  01, 
whose probability is roughly twice that of sequence
0111, which labels a period-four orbit;
however, the respective peak heights in $S(f)$
differ almost by a factor six.

Finally,  the  connections  between  average  ex¬ 
pansion  rates  and  long-time  decay  of  correlation 
functions  discussed  in  ref.  [10]  do  not  hold  exact¬ 
ly  in  generic  systems  (for  related  investigations, 
see  refs.  [11,  21]).  As  a  consequence,  the  widths 
of  the  peaks  cannot  be  simply  explained  in  terms 
of  eigenvalues  of  unstable  periodic  orbits. 

In  this  work,  we  show  that  the  structure  of  the 
spectra  is  to  be  attributed  primarily  to  the  incom¬ 
pleteness  of  the  folding  mechanism  in  phase 
space:  this  phenomenon  is  ubiquitous  in  physical 
systems  without  simple  symmetries.  In  fact, 
maps  whose  symbolic  dynamics  can  be  repre¬ 
sented  on  complete  n-ary  trees  (such  as  the 
Bernoulli  shift  or  the  tent  map  and  conjugated 
transformations [22]) have simple spectral features:
$S(f)$ is either white or has a broad peak
(not necessarily Lorentzian) around $f = 0$. The
existence  of  forbidden  orbits  (pruned  branches 
on  the  symbolic  tree)  is  responsible  for  the  ap¬ 
pearance  of  peaks  at  generic  frequencies:  no 
symmetry-induced  “cancellations”  occur  in  the 
Fourier  transform  of  the  time-signal.  In  the 
Henon  map,  e.g.,  only  one  period-one  orbit 
(label  1)  belongs  to  the  attractor:  the  motion  in 
its  vicinity  is  not  “compensated”  in  eq.  (2.2)  by 
any  contribution  coming  from  the  neighbour¬ 
hood  of  the  other  fixed  point  (label  0),  which  lies 
at  a  finite  distance  from  the  strange  attractor. 
Finally,  another  important  aspect  of  the  folding 
process  is  the  amount  of  rotation  of  a  phase- 
space  element  upon  the  action  of  the  dynamical 
law: in ref. [10] it was pointed out that the sign
of the local expansion rates (giving rise to either
order-preserving  or  order-reversing  transforma¬ 
tions,  in  the  one-dimensional  case)  played  a  rel¬ 
evant  role  in  the  behaviour  of  the  correlation 
functions.  These  points  will  be  discussed  in  detail 
in  the  following  sections. 

3.  Hierarchical  modelling 

Let the state of the system be represented by a
point $x$ in a $d$-dimensional phase space $X$. If the
dynamics is governed by a continuous-time transformation,
the procedure to obtain a hierarchical
description of it can be started only after having
discretized time with the help of a Poincare
section $\Sigma$ [23] in $X$. This must be done for both
differential flows and experimental (scalar) time
series $\{x(\tau), x(2\tau), \ldots\}$ (where the sampling
time $\tau$ is much smaller than the average recurrence
time $T$ on $\Sigma$). In the latter case, the
trajectories must first be reconstructed in a suitable
embedding space [1]. The time evolution on
$\Sigma$ is hence governed by a mapping of the form

$$x_{n+1} = f(x_n), \qquad (3.1)$$

where $n$ is an integer and $f: \Sigma \to X$ is a generally
unknown nonlinear function. We then introduce
a partition $\mathcal{D}$ of $X$ consisting of a finite number $r$
of disjoint subsets $\Delta_i$ ($i = 0, 1, \ldots, r-1$); i.e.,
$\Delta_i \cap \Delta_j = \emptyset$, for $i \neq j$, and $X = \bigcup_{i=0}^{r-1} \Delta_i$. We further
assume that the transformation $f$ admits a
natural invariant measure $m$ [1], so that
$m[f^{-1}(\Delta_i)] = m(\Delta_i)$, for all $\Delta_i \in \mathcal{D}$, where $f^{-1}$
denotes the inverse of $f$. A generic orbit $\omega = \{x_0,
x_1, \ldots, x_n\}$ visits various elements $\Delta \in \mathcal{D}$. Denoting
with the symbol $s_n \in A_0 = \{0, \ldots, r-1\}$
(where $A_0$ is the alphabet) the index of the
domain $\Delta$ visited at time $n$, the trajectory $\omega$ is
mapped to the symbolic sequence $S = s_0 s_1 \ldots s_n$.
It is important to notice that sequence $S$ can be
produced (in $n$ iterates) only by the points $x_0$
which belong to the intersection $\Delta_S = \Delta_{s_0} \cap
f^{-1}(\Delta_{s_1}) \cap \ldots \cap f^{-n}(\Delta_{s_n})$ ($f^n$ being the $n$th iterate
of $f$) [19]. Since the map admits an invariant
measure $m$, the signal $\mathcal{S} = \ldots s_0 s_1 s_2 \ldots$, produced
by infinitely iterating $f$, is stationary. The
probability $P(S)$ of each finite subsequence $S$ can
then be evaluated as the frequency of occurrence
of $S$ in $\mathcal{S}$. The normalisation is, as usual,
$\sum_{|S| = n} P(S) = 1$, $\forall n$, where $|S|$ denotes the length
(number of symbols) of $S$. Obviously, $P(S) =
m(\Delta_S)$: i.e., the probability of a sequence equals
the mass contained in the phase-space region
with the same label. Therefore, symbolic strings
$S$ with increasing length $|S| = n$ identify smaller
and smaller sets in $X$ and their probability decreases
accordingly. The collection of all admissible
two-symbol strings indexes the elements
$\Delta_i \cap f^{-1}(\Delta_j)$ of the first refinement [23] $\mathcal{D}_1$ of
the partition $\mathcal{D}$ under $f$; three-symbol strings
label the second refinement $\mathcal{D}_2$, and so on. If
every infinitely long symbol sequence corresponds
to a single point, the partition $\mathcal{D}$ is called
generating [23] and the study of the symbolic
signal $\mathcal{S}$ is “equivalent” to that of the real
trajectories of the system.
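
The estimation of the block probabilities $P(S)$ from a symbolic signal amounts to counting. The sketch below encodes an orbit of a quadratic map with a binary partition at $x = 0$ and tabulates the relative frequency of all blocks of a given length. The map $x \to 1 - a x^2$, the parameter value, the orbit length and the function names are assumptions made for the example; any other map and partition could be substituted.

```python
from collections import Counter


def symbolic_signal(n_symbols, a=1.85, x0=0.1, n_transient=1000):
    """Binary symbols (1 if x_n > 0, else 0) for the map x -> 1 - a x^2."""
    x, symbols = x0, []
    for i in range(n_transient + n_symbols):
        x = 1.0 - a * x * x
        if i >= n_transient:
            symbols.append("1" if x > 0.0 else "0")
    return "".join(symbols)


def block_probabilities(signal, length):
    """Relative frequency P(S) of every block S of the given length in the signal."""
    n_windows = len(signal) - length + 1
    counts = Counter(signal[i:i + length] for i in range(n_windows))
    return {block: c / n_windows for block, c in counts.items()}


signal = symbolic_signal(20000)
for n in (1, 2, 3):
    probs = block_probabilities(signal, n)
    print(n, {s: round(p, 3) for s, p in sorted(probs.items())})
```

For each block length the estimated probabilities sum to one, and, up to end effects of order one over the signal length, the probability of a block equals the sum of the probabilities of its one-symbol extensions, as stationarity requires.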

The phase-space dynamics given by eq. (3.1) is
translated, in the space of all bi-infinite sequences
over the alphabet $A_0$, into a dynamical
process $\hat{\sigma}$ called shift map, which is defined by
$\hat{\sigma}(\ldots \cdot s_0 s_1 \ldots) = \ldots s_0 \cdot s_1 s_2 \ldots$ (where the
extra dot denotes an arbitrary, fixed origin). It is
also useful to define the set $A_0^*$ of all finite
sequences over $A_0$ which, for the binary alphabet
$\{0, 1\}$, reads $A_0^* = \{0, 1, 00, 01, 10, 11, 000,
001, \ldots\}$. Of course, a generic signal $\mathcal{S}$ need
not contain, as subsequences, all elements of $A_0^*$.
For example, in most natural languages one
never encounters more than three consecutive
consonants; the letter “q” is usually followed by
the “u”; in a musical score abrupt changes from
high to low notes (or vice versa) or among
different tonalities are avoided, as well as long
repetitions of a single note. The set $\mathcal{L}(\mathcal{S})$ of all
admissible subsequences of $\mathcal{S}$ is called the language
and is usually properly contained in $A_0^*$.
The set $\mathcal{L}$ is invariant under $\hat{\sigma}$: i.e., $\mathcal{L}(\mathcal{S}) =
\mathcal{L}(\hat{\sigma}(\mathcal{S}))$. Each allowed string $S$ corresponds to
a succession of $|S|$ enlargements in a non-empty
region of phase space. The pair $(\mathcal{L}, \hat{\sigma})$ constitutes
a symbolic dynamical system [23].

According to the previous discussion, we assume
that the signal $\mathcal{S}$, of length $|\mathcal{S}|$, is
stationary. Hence, we estimate the probability
$P(S)$ of each subsequence $S$, for $|S| =
1, 2, \ldots, n_{\max}$; of course, in order to achieve
reliable statistics, it is necessary that the number
of a priori possible sequences (with length
$n_{\max}$) be much smaller than $|\mathcal{S}|$. The aim of the
investigation is to furnish a succession of models
approximating the unknown dynamics, in such a
way that average properties like, e.g., correlation
functions are accurately reproduced. These
models are first derived for symbolic sequences
in $A_0^*$ and then “dressed” with the actual coordinate
values $x$ in phase space. Since, in general,
the dynamics folds phase space incompletely, not
all a priori possible concatenations of the symbols
$s_i \in A_0$ are produced (i.e., $\mathcal{L}$ is a subset of
$A_0^*$). Several sequences are forbidden. For example,
consider a binary partition ($X = \Delta_0 \cup \Delta_1$,
$\Delta_0 \cap \Delta_1 = \emptyset$) and a system for which no point $x$
belonging to $\Delta_0$ is mapped back to $\Delta_0$ itself in one
step (i.e., $f(\Delta_0) \cap \Delta_0 = \emptyset$). The sequence $S = 00$
cannot occur and the only possible continuation
after symbol 0 is 1. Therefore, element $\Delta_0$ can be
readily renamed as $\Delta_{01}$, and the symbolic
dynamics yields concatenations of the “words”
$w_1 = 1$ and $w_2 = 01$. Such phenomena occur, in
nonlinear dynamical systems, at generic parameter
values: for example, the symbolic signal produced
by the logistic map

$x_{n+1} = 1 - a x_n^2$    (3.2)

in a finite region around $a = 1.85$, with
$s_n = [1 + \mathrm{sgn}(x_n)]/2$, can be rewritten completely
in terms of the three words $w_1 = 1$, $w_2 = 01$ and
$w_3 = 001$. Not all combinations of them, however,
are allowed: at $a = 1.85$, $w_3 w_1$, w'jtVjW'j,
WjW2iV2**'ftVjW2, . . . , are not. The list of such
irreducible*' forbidden words for the Hénon

*'  A  forbidden  word  is  irreducible  if  it  does  not  contain 
another  forbidden  word. 


map (2.1) at ($a = 1.4$, $b = 0.3$) begins with (KKK).
0010, 0110, OIOKXK), 0111000, . . . etc. [24]. In
both these cases it is believed that there is an
infinite number of prohibitions. If, instead, the
(sub)shift $\hat{\sigma}$ is specified by a finite list of blocks
which may not appear in the signal, it is said to
be of finite type [23].
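
The search for irreducible forbidden words can be illustrated with a short Python sketch. It assumes that eq. (3.2) is the logistic map in the form x' = 1 - a x^2 (as in the caption of fig. 5) and that a symbol is 1 when x > 0 and 0 otherwise. The function names, the signal length and the initial condition are illustrative choices, not taken from the paper, and a finite signal can only suggest (never prove) that a long word is forbidden.

```python
from itertools import product


def logistic_symbols(a=1.85, n=100_000, x0=0.3, transient=1_000):
    """Binary signal s_n = [1 + sgn(x_n)]/2 for x_{n+1} = 1 - a*x_n**2."""
    x = x0
    for _ in range(transient):          # discard the initial transient
        x = 1.0 - a * x * x
    symbols = []
    for _ in range(n):
        symbols.append("1" if x > 0 else "0")
        x = 1.0 - a * x * x
    return "".join(symbols)


def observed_words(signal, n_max):
    """All substrings of length 1..n_max that occur in the signal."""
    return {signal[i:i + k]
            for k in range(1, n_max + 1)
            for i in range(len(signal) - k + 1)}


def irreducible_forbidden(signal, n_max, alphabet="01"):
    """Unobserved words (up to n_max) containing no shorter unobserved word."""
    language = observed_words(signal, n_max)
    forbidden = []
    for k in range(1, n_max + 1):
        for letters in product(alphabet, repeat=k):
            word = "".join(letters)
            if word in language:
                continue
            # Irreducible: no previously found forbidden word inside it.
            if not any(f in word for f in forbidden):
                forbidden.append(word)
    return forbidden


if __name__ == "__main__":
    signal = logistic_symbols()
    print(irreducible_forbidden(signal, 6))
```

For short words the result is stable against the signal length; for words approaching the maximum block length, absence from the data may simply reflect poor statistics.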

Several  families  of  formal  languages  have  been 
identified  so  far,  the  simplest  class  being  repre¬ 
sented  by  the  so-called  regular  languages  [25]. 
They  include,  in  addition  to  the  subshifts  of  finite 
type,  the  sofic  systems  [26]:  the  typical  example 
is  provided  by  the  logistic  map  at  the  period-two 
band-merging  point  [22]  which  yields  forbidden 
words of the form $0(11)^n 0$, $\forall n \ge 0$, where $w^n$
indicates the $n$-fold consecutive repetition of $w$.
Regular languages can be described by finite
automata [25] which produce symbolic strings
according to a sequential (usually, stochastic)
mechanism. Three examples of these are presented
in section 9 (figs. 9a,b,c). Higher-order
generation schemes are defined by means of
parallel mechanisms, called grammars, which
transform symbols in a string $S$ into words
chosen from a list $W = \{w_1, w_2, \ldots\}$. For example,
a substitution over the binary alphabet $A_0 = \{a, b\}$
may be specified by a rule of the type
($a \to \psi(a) = w_1 = ab$; $b \to \psi(b) = w_2 = bba$): this
yields, upon application to the string $S = ab$,
$S' = \psi(ab) = abbba$. These “grammatical rules”
may depend, in general, on a number of nearest
neighbours of the symbol to be rewritten and be
either deterministic or not [25]. Particularly relevant
for nonlinear dynamical systems are the
two transformations $\psi_{PD}$: (0 → 01; 1 → 10) and
$\psi_{QP}$: (0 → 1; 1 → 01), which model the period-doubling
(PD) accumulation point dynamics
[7]*’ and the golden-mean quasiperiodic (QP)
transition to chaos [8], respectively.
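
A parallel rewriting rule of this kind is easy to state in code. The following Python sketch applies a substitution to every symbol of a string at once. The dictionary names are illustrative, and the quasiperiodic rule (0 → 1; 1 → 01) is an assumed reading of the text, chosen to be consistent with the basic words 1 and 01 quoted for QP.

```python
def substitute(string, rule):
    """Rewrite every symbol of the string in parallel according to the rule."""
    return "".join(rule[symbol] for symbol in string)


def iterate(string, rule, steps):
    """Apply the substitution a given number of times."""
    for _ in range(steps):
        string = substitute(string, rule)
    return string


PSI_EXAMPLE = {"a": "ab", "b": "bba"}   # example rule from the text
PSI_PD = {"0": "01", "1": "10"}          # Morse-Thue (period doubling)
PSI_QP = {"0": "1", "1": "01"}           # assumed golden-mean rule

if __name__ == "__main__":
    print(substitute("ab", PSI_EXAMPLE))   # abbba
    print(iterate("0", PSI_PD, 3))         # 01101001
    print(iterate("1", PSI_QP, 3))         # 01101
```

Under the assumed QP rule the string lengths grow as Fibonacci numbers (1, 2, 3, 5, ...), and the block 00 never appears.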

*’ The actual symbolic dynamics of the logistic equation at
PD is described by a different transformation which is,
however, completely equivalent to the more symmetric one
considered here (called Morse-Thue substitution and first
considered in ref. [27] in connection with PD).

In general, the optimal description of a subshift
dynamics may be given by a mixture of
sequential and parallel mechanisms. In our approach,
this description is sought using
only the signal $\mathcal{S}_0$, since we do not assume
knowledge of the actual dynamical map (3.1)
(and of the partition). It is then clear that even
just  the  identification  of  the  basic  words  (01  and 
10  for  PD,  1  and  01  for  QP)  involved  in  the 
generation  process  is  a  rather  difficult  problem, 
which  may  be  even  undecidable  [25]  in  some 
cases:  a  few  results  about  recognizability  of  sub¬ 
stituted  words  in  grammatical  productions  of 
known  origin  can  be  found  in  ref.  [28].  In  gener¬ 
al,  parallel  rewriting  rules  are  expected  to  gener¬ 
ate  signals  which  are  most  difficult  to  describe  by 
means  of  sequential  models. 

These simple examples suggest that, in order
to achieve a condensed representation of $\hat{\sigma}$, it is
useful to decompose the signal into a succession
of suitable “primitive” words $w_1, w_2, \ldots$, possibly
having different lengths, which generate (a
superset of) the whole language $\mathcal{L}$ upon concatenations
$w_i w_j \ldots$ with one another. They constitute
a code [6, 5] and their number may be either
finite or not, depending on the nature of $\mathcal{S}_0$ and
on the criterion adopted for their definition.
They should be chosen in such a way that the
most compact description of the dynamics is
obtained. For example, if the model to be built
must predict all admissible sequences of length
$|S| > n_0$ by concatenating primitives, one should
minimize the number of forbidden words appearing
in this process across the junction between
consecutive primitives. In fact, if 00 is forbidden
and the bare symbols 0 and 1 are chosen as
primitives, the tree has a pruned branch at level
2 (obtained when trying to concatenate 0 with 0);
if, instead, the primitives are 1 and 01, no topological
“defect” occurs (see appendix A for more
details).

In  the  following  we  suppose  that  a  set  of  code 
words  has  been  identified:  the  technical  descrip¬ 
tion  of  a  specific  algorithm  can  be  found  in 
appendix  A.  All  admissible  sequences  are  clas¬ 
sified  hierarchically  by  constructing  a  tree  (see 


Fig. 2. First two levels of the logic tree for the roof map at
$a = (3 - \sqrt{3})/4$ (see appendix A for a description of the
construction method).


fig. 2, for an example), the vertices of which are
labelled, at the first level, by the primitives
themselves. The $l$th level ($l = 1, \ldots, \infty$) contains
concatenations of $l$ such words. All branches
leaving a generic vertex $S$ point to the allowed
extensions $S w_i, S w_j, \ldots$ of sequence $S$ (refinements
of subset $\Delta_S$). If the signal is aperiodic,
there are branching vertices, corresponding to
strings with more than one possible continuation.
The method illustrated in appendix A yields a
code which satisfies the normalization condition
$\sum_i P(w_i) = 1$. Furthermore, the probabilities
obey the Kolmogorov consistency condition
$P(S) = \sum_i P(S w_i)$, so that every complete level of
the tree represents a full covering of the phase
space (more precisely, of the Poincaré section
$\Sigma$): in fact, $1 = \sum_i P(w_i) = \sum_i \sum_j P(w_i w_j)$, and so
on. Trees constructed in this way are equivalent
to “generalized” Markov models, which describe
the dynamics as a sequence of events $w_i$, occurring
with measurable transition probabilities, according
to unknown rules (determined by the
dynamical map $f$ in eq. (3.1)). Since the sequences
have variable length at every level, in
general, the memory extent depends on the
probabilities of the orbits. The order of such
models can be estimated by the “average Markov
time” per level [5]

$\theta = \lim_{l \to \infty} \frac{1}{l} \sum_{S \in \text{level } l} |S| P(S)$    (3.3)

which equals 1 for ordinary trees (also called
block code [6], where the symbols themselves
are taken as primitive). A description based on
these Markov trees provides a tool for the understanding
of the power spectrum $S(f)$ of the
signal $\mathcal{S}_0$. In the next sections, the topological
and metric (probabilistic) features of the tree will
be related to the shape of $S(f)$, by turning the
description into an effective model for the generation
of signals which are “statistically close” to
$\mathcal{S}_0$. Before doing this, we briefly illustrate the
criterion for the choice of suitable primitive
words.
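
The finite-level estimate behind eq. (3.3) can be sketched in a few lines of Python. The sketch assumes that one tree level is given as a dictionary mapping each sequence (a tuple of primitive words) to its probability; this data layout and the function name are illustrative, and the value returned is the estimate at that level, not the limit.

```python
def average_markov_time(level_probs):
    """(1/l) * sum over level-l sequences S of |S| * P(S).

    level_probs maps a tuple of l primitive words to its probability;
    |S| is the total number of symbols in the concatenated words.
    """
    level = len(next(iter(level_probs)))      # number of primitives per sequence
    total = sum(len("".join(words)) * p for words, p in level_probs.items())
    return total / level


if __name__ == "__main__":
    # Block code: the bare symbols are the primitives, so the result is 1.
    print(average_markov_time({("0",): 0.4, ("1",): 0.6}))
    # Level-one probabilities quoted for the logistic map at a = 1.85.
    print(average_markov_time({("1",): 0.5896, ("01",): 0.3236, ("001",): 0.0868}))
```

With the level-one probabilities quoted in section 4 for the logistic map, the first-level estimate is about 1.497 symbols per primitive.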

A  particularly  relevant  role  in  determining  the 
recurrence  (long-term)  properties  of  nonlinear 
dynamical systems is played by the set $\Omega$ of
“non-wandering” points [23]: a point $x$ is non-wandering
for the map $f$ if, for any neighbourhood
$U$ of $x$, there exists a number $n > 0$ such
that $f^n(U) \cap U \ne \emptyset$ (an analogous definition can
be given for a subshift $\hat{\sigma}$). Correspondingly, a
symbolic sequence $S$ which labels a domain in
$\Omega$ will be observed in the signal with a
well-defined frequency of occurrence. The set $\Omega$
consists of points with a weak recurrence property:
in particular, all periodic points of $f$ belong to
$\Omega$ [29]. This property suggests a useful criterion
to distinguish primitive words from “transient”
(i.e., non-recurrent) ones. Following ref. [5], we
define a primitive as a string $w$ which can be
periodically extended up to the maximum investigated
block length $n_{\max}$ and which does not
contain a prefix with the same property; e.g., 001
is  a  primitive  if  001001  ...  is  allowed  and 
000 ...  is  not  (see  appendix  A  for  details).  This 
is  a  “strong”  condition:  namely,  one  which  does 
not  depend  on  tuning  parameters  or  on  statistical 
weights.  Weaker  conditions  (depending,  e.g.,  on 
the  probabilities  P{S))  are  discussed  in  ref.  [5]. 
Recall  that  periodic  orbits  are  dense  on  the 
non-wandering  sets  of  hyperbolic  (axiom-A) 
dynamical  systems  [23]. 
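
The “strong” criterion can be sketched in Python as follows. This is a simplified reading of the definition above, not the algorithm of appendix A: a word counts as periodically extendable if its periodic repetition, truncated to the maximum block length, belongs to the set of admissible words, and it is a primitive if no shorter primitive is a prefix of it. The helper that builds a test language (all binary words avoiding a given block) and all function names are illustrative.

```python
from itertools import product


def language_without(block, n_max, alphabet="01"):
    """All words of length 1..n_max that do not contain the given block."""
    return {"".join(t)
            for k in range(1, n_max + 1)
            for t in product(alphabet, repeat=k)
            if block not in "".join(t)}


def periodically_extendable(word, language, n_max):
    """True if www... truncated to n_max symbols is an admissible word."""
    repeats = -(-n_max // len(word))          # ceiling division
    return (word * repeats)[:n_max] in language


def find_primitives(language, n_max, max_len, alphabet="01"):
    """Extendable words (up to max_len) having no primitive as a prefix."""
    primitives = []
    for k in range(1, max_len + 1):
        for letters in product(alphabet, repeat=k):
            word = "".join(letters)
            if not periodically_extendable(word, language, n_max):
                continue
            if any(word.startswith(p) for p in primitives):
                continue
            primitives.append(word)
    return primitives


if __name__ == "__main__":
    lang = language_without("00", 6)          # 00 forbidden, as in the text
    print(find_primitives(lang, 6, 3))        # ['1', '01']
```

In practice the language would be the set of words observed in the signal, so the outcome depends on the signal length and on the maximum block length.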

Our  analysis  is  not  restricted  to  signals  pro¬ 
duced  by  nonlinear  dynamical  systems  but  can  be 
applied,  more  generally,  to  any  stationary  one¬ 
dimensional  symbolic  pattern,  such  as  a  spin 


configuration generated by a Monte Carlo process
or by a cellular automaton [30]. The study
of  power  spectra  of  parallel-generated  sequences 
has  received  much  attention  in  the  theory  of 
one-dimensional  quasi-crystals  [31].  All  of  these 
systems  exhibit  a  rich  structure  of  subsequences 
with  a  recurrent  character.  If  only  the  data  are 
given  and  the  nature  of  the  physical  process  is 
unknown,  no  theoretical  argument  can  be  used 
to  obtain  asymptotic  estimates  (i.e.,  concerning 
infinitely long orbits). In such a case, the main
limitation of our approach is represented by the
size of the available computer memory: with a
binary alphabet, one is usually forced to choose
$n_{\max} \le 22$.

With generic chaotic signals it is possible to
find primitives which are periodically extendable
up to $n_{\max}$ if the signal is sufficiently long (i.e., if
$|\mathcal{S}_0|$ is at least of the order of $10\, r^{n_{\max}}$, where $r$ is
the number of symbols in the alphabet $A_0$). The
analysis has been performed for the logistic
equation at various parameter values, for several
other one- and two-dimensional maps and for
some chaotic flows. The first few primitives of
the Hénon map at ($a = 1.4$, $b = 0.3$) are $w_1 = 1$,
$w_2 = 01$, $w_3 = 0011101$, $w_4 = 0011111$, $w_5 = 00111101$,
$w_6 = 00011101$, $w_7 = 00011111$, $w_8 = 000111101$,
$w_9 = 0011110011101$, and so on. The
simple  examples  of  period  doubling  and  quasi¬ 
periodicity  are  treated  in  the  appendix.  The 
whole  unfolding  procedure  is  applicable  indepen¬ 
dently  of  the  existence  of  a  Markov  partition  [23] 
(if  one  exists,  a  finite  set  of  primitives  is  always 
found:  the  converse  is  not  true). 


4.  Iterated  coding  and  renormalization 

Before  describing  the  models  which  can  be 
constructed  using  the  information  stored  on  the 
tree,  we  discuss  a  further  very  important  step  in 
the  unfolding  scheme.  It  consists  of  an  iterative 
procedure  which  enables  us  to  achieve  higher 
accuracy and to analyze automatically also phenomena
at the border of chaos (ergodic, non-mixing).
The subdivision of the signal into primitive
words constitutes a coarse graining in the
time  direction,  which  corresponds  to  enlarge¬ 
ments  of  some  regions  of  phase  space.  Because 
of  the  arbitrariness  of  the  acceptance  condition 
for  the  code  words  (periodic  extendability,  e.g.) 
and  of  the  presence  of  resolution  parameters 
imposed  by  the  finiteness  of  the  amount  of  data 
($n_{\max}$, $n_{\mathrm{cut}}$; see appendix A), this procedure can
be  viewed  as  an  approximate  language-recogni¬ 
tion  method  [32].  The  nature  of  the  physical 
process  is,  in  fact,  unknown  and  the  primitives 
are  not  expected,  in  general,  to  reflect  fully 
asymptotic  properties  of  the  system.  Also,  the 
resulting  hierarchical  description  may  not  be  the 
most  compact  one.  This  is  particularly  evident  at 
PD  and  QP,  where  the  signal,  although  gener¬ 
ated  sequentially,  can  be  more  efficiently  de¬ 
scribed  by  means  of  a  parallel  algorithm.  On  the 
other  hand,  the  dynamics  itself  may  be  explicitly 
of  parallel  type,  as  in  Monte  Carlo  updates  of 
spin chains or in cellular automata. The identification
of the long-ranged “coherent” structures
appearing in such systems is particularly difficult
and requires consideration of increasingly long
blocks of symbols. The coarse graining of the
signal into code words, in such cases, can never
be considered as definitive. Therefore, it is
necessary to resort to a higher-level modelling
procedure, which provides the unfolding method
with an explicit parallel mechanism. To this aim,
the primitive words $w_1, w_2, \ldots$ detected in the
original signal $\mathcal{S}_0$ are renamed with symbols from
a new alphabet $A_1 = \{0, 1, \ldots\}$ and the whole
analysis is repeated on the transformed string $\mathcal{S}_1$
thus obtained. The iteration of this procedure
yields  a  progressive  coarse  graining  of  the  signal 
(corresponding  to  an  increase  of  resolution  per 
symbol  in  phase  space).  Once  the  primitives  have 
been  recorded  in  a  table,  they  can  be  easily 
recognized  while  scanning  the  source  signal  se¬ 
quentially,  since  the  method  illustrated  in  the 
appendix  always  yields  instantaneous  (prefix-free 
[6])  codes.  The  description  of  the  image  signal 


$\mathcal{S}_k$, obtained after $k$ recoding steps, consists of
the derived tree and of the code which keeps
track of the previous block-renaming cascade
(i.e., of the relations between each symbol in the
alphabet $A_k$ and its pre-image string in $\mathcal{S}_0$). The
recoding procedure is equivalent to a renormalization-group
transformation on the nonlinear
map $f$ associated with the subshift [27]: in fact,
renaming a sequence $S$ of length $n$ with a single
symbol corresponds to considering the $n$th iterate
of $f$ in the phase-space element $\Delta_S$. For
perfectly self-similar languages, such as those of
PD and QP, the trees obtained at each step are
identical: i.e., an exact renormalization is readily
achieved [5]. For signals of purely sequential
nature, instead, only a few recoding steps are
usually possible and useful. However, recoding
always yields more asymptotic estimates of the
observables associated with the symbolic sequences.
For example, after the substitution
$(w_1, w_2, w_3) \to (0, 1, 2)$, in the logistic map at
$a = 1.85$, all strings which previously appeared
as overlaps between consecutive primitives
(like 10, or 100 in $\ldots w_2 w_2 w_1 w_2 w_3 w_2 \ldots = \ldots 010110100101 \ldots \to \ldots 110121 \ldots$,
for example) simply disappear. In the new signal,
overlap-free probabilities $P'$ are therefore computed
and unnecessary sequences are automatically
neglected in the tree description. Recalling
that in the thermodynamic formalism for nonlinear
dynamical systems [3] the role of the
Hamiltonian is played by $H(s_1 \ldots s_n) = \ln P(s_1 \ldots s_n)$ [33],
it is readily seen that recoding
$\mathcal{S}_0 \to \mathcal{S}_1$ and recomputing the probabilities in
$\mathcal{S}_1$ is a completely analogous operation to obtaining
a renormalized block Hamiltonian $H' = \ln P'(w_1 \ldots w_m)$
(with $|w_i| \ge 1$, $m \le n$). In the
case of the logistic map at $a = 1.85$, we find
$P(1) \approx 0.5896$, $P(01) \approx 0.3236$, $P(001) \approx 0.0868$
in $\mathcal{S}_0$ and $P'(0) \approx 0.451$, $P'(1) \approx 0.401$, $P'(2) \approx 0.148$
in $\mathcal{S}_1$. Another advantage of the renaming
process is that the code redundancy may be
reduced: in fact the whole signal at PD can be
encoded with the first two primitives $w_1$ and $w_2$
only, although $P(w_1) + P(w_2) < 1$ in $\mathcal{S}_0$ and all
other code words never occur in the renaming
procedure.
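
One recoding step can be sketched in Python as a sequential scan of the signal, which works because the code is prefix-free. The function names are illustrative, and each primitive is renamed with the digit giving its position in the list, so the sketch assumes at most ten primitives.

```python
from collections import Counter


def parse(signal, primitives):
    """Split the signal into primitives by a single left-to-right scan.

    Because the code is prefix-free, the first match is the only possible one.
    Returns the list of words and any unparsed tail.
    """
    words, buffer = [], ""
    for symbol in signal:
        buffer += symbol
        if buffer in primitives:
            words.append(buffer)
            buffer = ""
    return words, buffer


def recode(signal, primitives):
    """Rename each primitive with the digit of its index in the list."""
    index = {word: str(i) for i, word in enumerate(primitives)}
    words, _ = parse(signal, primitives)
    return "".join(index[w] for w in words)


def symbol_probabilities(signal):
    """Relative frequency of each symbol in the (recoded) signal."""
    counts = Counter(signal)
    return {s: c / len(signal) for s, c in counts.items()}


if __name__ == "__main__":
    primitives = ["1", "01", "001"]           # w1, w2, w3 -> 0, 1, 2
    print(recode("010110100101", primitives))  # 110121
```

The example string is the one quoted in the text: strings such as 10 or 100, which straddle two primitives in the original signal, have no counterpart in the recoded one.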

5.  Reconstruction  of  power  spectra 

The  trees  obtained  with  the  procedure  illus¬ 
trated  in  the  previous  section  can  be  used  to 
predict the values of physical observables associated
with a level-$l$ sequence $S$, from the knowledge
of those corresponding to the orbits allocated
at levels 1 to $l - 1$. In ref. [5] a notion of
complexity  was  introduced,  related  to  the  diffi¬ 
culty  of  inferring  the  structure  of  the  system  at 
increasingly  finer  levels  of  resolution,  by  means 
of  estimates  based  on  coarser-scale  measure¬ 
ments.  In  particular,  considering  the  prob¬ 
abilities  P(S),  it  is  possible  to  study  sequence-to- 
sequence  transitions  and  to  reconstruct  signals 
with  statistical  properties  which  closely  approxi¬ 
mate  those  of  the  actual  one.  To  this  aim,  the 
dynamics  is  modelled  by  means  of  a  sequence  of 
Markov processes (one for each level $l$), defined
by conditional probabilities, such as $P(S'|S)$,
which, in turn, are estimated from the probabilities
of lower-level sequences. The accuracy
of the model increases with the order $l$ of the
level.  Although  no  exact  predictions  of  the  actu¬ 
al  time  dynamics  are  possible,  in  general,  this 
method  approaches  the  limit  of  an  “optimal” 
description  in  a  statistical  sense:  the  reconstruc¬ 
ted  signals,  in  fact,  exhibit  expectation  values 
and  transition  probabilities  which  are  in  close 
agreement  with  those  of  the  system  under  inves¬ 
tigation.  The  dynamics  is  represented  as  a  suc¬ 
cession  of  deterministic  paths  (blocks  of 
symbols)  which  appear  at  random  in  time,  ac¬ 
cording  to  the  measured  probabilities.  The  usage 
of  variable-length  words  furnishes  a  simple  inter¬ 
pretation  of  the  structure  of  power  spectra,  in 
addition  to  improvements  in  the  convergence  of 
the  approximations  and  in  the  compactness  of 
the  description.  In  turn,  the  convergence  of  the 
reconstructed power spectra with the order $l$
provides  a  test  for  the  validity  of  the  model. 


The set of all admissible trajectories can be
generated with the help of a transition matrix $M$
whose elements $M_{ij} = P(S_j | S_i)$ represent the
conditional probability of observing sequence $S_j$
given $S_i$. They refer to the level-$l$ strings (composed
of $l$ primitives each) and satisfy the relation
$\sum_{j=1}^{N(l)} M_{ij} = 1$, where $N(l)$ is the number of
orbits at level $l$. A bi-infinite sequence
$\mathcal{S} = \ldots S_{i_{-1}} S_{i_0} S_{i_1} \ldots$ is admissible if $M_{i_n i_{n+1}} > 0$ for
all $n \in (\ldots, -1, 0, 1, \ldots)$. The dynamics is then
described by a shift map $\hat{\tau}(\mathcal{S}) = \mathcal{S}'$, with $S'_{i_n} = S_{i_{n+1}}$,
which advances by $l$ primitives at a time (at
variance with $\hat{\sigma}$). When the population $N(l)$ of
level $l$ is finite, the set of all admissible sequences
together with map $\hat{\tau}$ is a generalization of a
subshift of finite type [23] with transition matrix
$M$. In practice, $N(l)$ is bounded by the number of
primitives $N(1, n_{\max})$, found with a finite value
of $n_{\max}$, raised to the power $l$.

Symbolic trajectories $S_i S_j \ldots$ equivalent to
the actual signal can be generated with the following
simple algorithm. Starting with sequence
$S_i$ ($1 \le i \le N(l)$), its successor $S_k$ is determined by
choosing a random number $u$ from the uniform
distribution over the interval (0, 1) and by comparing
it with the values $M_{ij}$, for all $j$: the index
$1 \le k \le N(l)$ of the successor is the integer which
satisfies the relation $\sum_{j=1}^{k-1} M_{ij} < u < \sum_{j=1}^{k} M_{ij}$. Of
course, if some recoding steps had been performed
during the unfolding process, each symbol
in the current sequence $S_k$ has to be translated
back to its pre-image word in the set of
primitives found in the analysis of the original
signal $\mathcal{S}_0$: in the case of the logistic map at
$a = 1.85$, for example, the sequence
102 is read as $011001 = \psi(102)$, where $\psi(0) = 1$,
$\psi(1) = 01$ and $\psi(2) = 001$. The signal is hence
generated by the composition of the Markov
shift $\hat{\tau}$ with one application of the substitution $\psi$.
For PD and QP, of course, not a single Markov
step is needed, since the infinite iteration of the
codes $\psi_{PD}$ or $\psi_{QP}$ already produces the exact
signal with any input string. In general, the
reconstructed signals are given by iteration of a
composition of the form $\psi^m \hat{\tau}$, for some $\psi$ (possibly,
the identity), with $m \ge 1$. Some analytical
results have been recently obtained for the
power spectra of the invariant strings of pure
substitutions (including PD and QP) which satisfy
certain restrictions [31, 34]. The exactly self-similar
structure of such systems allows writing
of recursive relations directly for the Fourier
amplitudes. In the next section, we present
analytical results for a few Markov models describing
the simplest examples of incomplete-folding
dynamics. So far, an analysis of signals
arising from a mixed parallel-sequential dynamics
has not been attempted.
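
The generation algorithm described above can be sketched in Python as follows. The transition matrix used in the example is hypothetical (it only encodes that the third word is never followed by the first one) and the function names are illustrative; the substitution is the one quoted in the text for the logistic map at a = 1.85.

```python
import random


def next_index(row, u):
    """Index k such that sum(row[:k]) < u <= sum(row[:k+1]).

    Entries with zero probability are skipped, so forbidden transitions
    are never selected.
    """
    cumulative, last = 0.0, None
    for k, p in enumerate(row):
        if p <= 0.0:
            continue
        cumulative += p
        last = k
        if u <= cumulative:
            return k
    return last          # guards against rounding when u is close to 1


def generate(matrix, start, steps, rng):
    """Markov chain of sequence indices driven by the transition matrix."""
    sequence = [start]
    for _ in range(steps):
        u = 1.0 - rng.random()               # uniform in (0, 1]
        sequence.append(next_index(matrix[sequence[-1]], u))
    return sequence


def decode(sequence, psi):
    """Translate each symbol back to its pre-image word."""
    return "".join(psi[k] for k in sequence)


PSI = {0: "1", 1: "01", 2: "001"}            # psi(0), psi(1), psi(2) from the text
M_EXAMPLE = [[0.6, 0.3, 0.1],                # hypothetical level-one matrix
             [0.6, 0.3, 0.1],
             [0.0, 0.7, 0.3]]                # word 001 is never followed by word 1

if __name__ == "__main__":
    print(decode([1, 0, 2], PSI))            # 011001
    chain = generate(M_EXAMPLE, 0, 30, random.Random(0))
    print(decode(chain, PSI))
```

In an actual reconstruction the matrix entries would be the conditional probabilities measured on the tree, and the decoded string would then be used to estimate the power spectrum.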

The calculation of the matrix elements (conditional
probabilities) $M_{ij}$ relative to transitions
between level-$l$ orbits requires, in principle, taking
into account a memory extending over $2l$
primitives. In fact, the probability $P(W|V)$ to
observe the sequence $W = w_1 w_2 \ldots w_l$ after
$V = v_1 v_2 \ldots v_l$ is defined as the ratio
$P(v_1 \ldots v_l w_1 \ldots w_l)/P(v_1 \ldots v_l)$. Since, in general, it is
impractical to consider levels $l$ and $2l$, because of
the relevant length of the sequences involved,
the conditional probabilities can be approximated
by comparing two successive levels (i.e., $l$
and $l + 1$). The resulting model has then a memory
limited to $l$ primitives and reduces to a usual
$l$th order Markov process once the probabilities
have been evaluated in the recoded signal. The
value of $P(W|V)$ is then estimated as

$P(w_1 \ldots w_l | v_1 \ldots v_l) \approx P(w_1 | v_1 \ldots v_l) \times P(w_2 | v_2 \ldots v_l w_1) \times \ldots \times P(w_l | v_l w_1 \ldots w_{l-1})$ ,    (5.1)

where $P(a_{l+1} | a_1 \ldots a_l) = P(a_1 \ldots a_{l+1})/P(a_1 \ldots a_l)$,
as usual. All memory effects available up
to level $l$ are included, since the dependence on
the last $l$ symbols is explicitly taken into account.
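
The approximation (5.1) amounts to sliding a window of l words along the concatenation VW and multiplying one-step conditional probabilities. A minimal Python sketch is given below. It assumes the signal has already been recoded so that every primitive is a single symbol, and it estimates probabilities from raw block counts; the function names are illustrative.

```python
from collections import Counter


def ngram_counts(sequence, n):
    """Counts of all blocks of n consecutive symbols."""
    return Counter(tuple(sequence[i:i + n])
                   for i in range(len(sequence) - n + 1))


def make_conditional(sequence, level):
    """Estimate P(a_{l+1} | a_1 ... a_l) = P(a_1 ... a_{l+1}) / P(a_1 ... a_l)."""
    joint = ngram_counts(sequence, level + 1)
    context = Counter()
    for block, count in joint.items():
        context[block[:-1]] += count         # contexts that have a successor

    def conditional(next_symbol, past):
        past = tuple(past)
        if context[past] == 0:
            return 0.0
        return joint[past + (next_symbol,)] / context[past]

    return conditional


def block_transition(W, V, conditional):
    """Right-hand side of eq. (5.1) for blocks W after V (both of length l)."""
    window = list(V)
    probability = 1.0
    for w in W:
        probability *= conditional(w, window)
        window = window[1:] + [w]            # drop oldest word, append newest
    return probability


if __name__ == "__main__":
    signal = [0, 1, 2] * 50                  # toy periodic recoded signal
    cond = make_conditional(signal, 2)
    print(block_transition([2, 0], [0, 1], cond))   # 1.0
    print(block_transition([1, 1], [0, 1], cond))   # 0.0
```

For a signal whose memory really extends over at most l words the product is exact; otherwise it is the level-l approximation discussed in the text.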


6.  Model  convergence  and  scaling  functions 

The  validity  of  the  approximation  (5.1)  and 
the  convergence  of  the  Markov  models  with  the 


order $l$ can be discussed within the general
framework of the thermodynamical formalism
for dynamical systems [3] and related to the
measure of complexity introduced in ref. [5].
The  product  on  the  r.h.s.  of  eq.  (5.1)  appears  in 
fact  in  the  coefficients  of  the  /th  power  of  the 
transfer  matrix  [3,  5] 


.  .  .  >v,,  I 

P(w,  .  .  .  w,)  "'I'’! 


(6.1) 


for  a  lattice  of  (variable-size)  spin  blocks  w,. 
Matrix  T  describes  the  conditional  probability  for 
the  transition  between  the  strings ...  w  ,>  [ 
.  .  .  rv]  .  .  .  and  .  .  .  w^W2  .  .  .  w,^^  .  .  .  upon  (icft) 
shifting of the signal by one primitive at a time.
The two sequences are “connected” only if the
last $l$ words of the left one coincide with the first
$l$ of the right one (the image), as taken into
account by the Kronecker $\delta$’s in eq. (6.1). Such
a transfer matrix is usually introduced [35, 4, 5]
to obtain asymptotic estimates of the generalized
metric entropies $K(q)$ [1] from a suitable eigenvalue
equation: the function $K(q)$ plays the role
of a free energy, while the parameter $q$ is related
to the temperature [33]. The convergence properties
of the associated thermodynamic sums are
therefore the same as those of the Markov models
constructed in the previous section, since in
both problems the relevant object is the transfer
matrix $T$. A detailed description of the asymptotic
behaviour of the conditional probabilities
can be graphically represented by means of a
scaling function $\sigma(t)$ [9] which, for our present
purposes, is defined as follows. The level-$l$ sequences
$S = w_1 w_2 \ldots w_l$ are mapped onto the
unit interval by associating a value $t = t(S)$ to
each of them. For every $t = t(S)$ the level-$l$
approximation $\sigma_l(t)$ of the scaling function $\sigma(t)$
is given by the conditional probability
$P(S = w_1 \ldots w_l)/P(w_1 \ldots w_{l-1})$. The definition
of the ordering parameter $t = t(S) \in [0, 1]$ can be
easily understood by referring to the logic tree of
fig. 2. At level $l = 1$, the primitives are considered
in the same order as on the tree (e.g.,
from left to right) and $t(w_k) = t(w_{k-1}) + P(w_k)$,
$k = 1, \ldots, N(1, n_{\max})$ ($w_0$ being the empty
string, with $t(w_0) = 0$). Hence, $\sigma_1(t)$ is piecewise
constant over $N(1, n_{\max})$ intervals. The generic
$k$th interval is split, at level two, into subintervals
labelled by all sequences $w_j w_k$ ending with $w_k$
and ordered from left to right according to
the order of the $w_j$’s on the first level of the tree.
In this way, we have $t(w_j w_k) = t(w_{j-1} w_k) + P(w_j w_k)$,
and the index $k$ is increased by one
after all $j$’s have been scanned. At level
three, all subintervals of $w_j w_k$ are labelled as
$w_i w_j w_k$, with the $w_i$’s ordered as usual, and so
on. For example, with a binary tree, the values
of $t$ correspond, from left to right, to the sequences
0 and 1 at level one, to 00, 10, 01 and
11, at level two, to 000, 100, 010, 110, 001, 101,
011 and 111, at level three. The widths of the
intervals are just the corresponding sequence
probabilities, so that forbidden strings do not
appear at all [5]. As an illustration, we display in
fig. 3 a schematic plot of the first two approximations
to $\sigma(t)$ for a tree with the same topology
as that of fig. 2. Notice that some intervals are
split only into two parts and not three, due to the
prohibitions. We have used the short notation $P_i$
for $P(w_i)$, $P_{ij}$ for $P(w_i w_j)$ and, similarly, $\sigma_i$ for
$\sigma(t(w_i))$ and $\sigma_{ij}$ for $\sigma(t(w_i w_j))$. In fig. 4 we show


Fig. 3. Schematic plots of the first two levels of approximation
to the scaling function $\sigma(t)$ for a tree with the same
topology as that displayed in fig. 2.


Fig. 4. Approximations to the probability scaling function
$\sigma(t)$ of the Lorenz flow at standard parameter values, obtained
from the first seven levels of the associated logic tree
(dotted lines: $l = 1, \ldots, 6$; solid line: $l = 7$). The ordering
parameter $t$ is defined in the text.

the approximations $\sigma_l(t)$ obtained from the first
seven levels of the tree for the Lorenz system
[36]

$\dot{x} = 10(y - x)$ ,
$\dot{y} = 28x - y - xz$ ,
$\dot{z} = -8z/3 + xy$ .    (6.2)

The Poincaré section $\Sigma = \{(x, z): \dot{x} = 10(y - x) = 0, \ddot{x}\,\mathrm{sgn}(x) < 0\}$,
which is obviously intersected
by all trajectories and has the same symmetry
as the complete flow, has been divided
into two regions by means of the line $x = 0$ (for
other choices of the parameters, the generating
partition is instead ternary [37]). Notwithstanding
the absence of the two period-one cycles, a
complete binary-tree description is appropriate,
because prohibitions appear only for $|S| > 24 > n_{\max}$
[38]. Within the statistical fluctuations, a
fast convergence is observed for increasing $l$,
indicating that Markov models of order as low as
5 or 6 already reproduce the symbolic dynamics
with very good accuracy. The memory vanishes
exponentially, as can be seen by inspecting fig.
4, where the distances between consecutive approximations
roughly decrease as $2^{-l}$. Notice that
a perfectly self-similar set is characterized by a
piecewise-constant asymptotic scaling function;
for system (6.2), instead, the apparent continuity
of $\sigma(t)$ indicates the existence of an infinity of
scaling rates for the probabilities. The attractor
is, in fact, close to an intermittent situation [39].
Convergence of $\sigma_l(t)$ to a function $\sigma(t)$, for
$l \to \infty$, indicates that the system is simple, relative
to the chosen unfolding scheme, and that
the derived model accurately describes the scaling
behaviour of the probabilities.
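
The ordering of the sequences along the unit interval, as described earlier in this section, can be reproduced with a short Python sketch: the last word is the most significant one, so each level is obtained by prepending every primitive to every sequence of the previous level. The function names and the dictionary layout for the probabilities are illustrative.

```python
def ordered_sequences(primitives, level):
    """Left-to-right order of the level-l sequences on the unit interval."""
    sequences = [()]
    for _ in range(level):
        # The suffix varies slowest, the newly prepended word fastest.
        sequences = [(w,) + s for s in sequences for w in primitives]
    return sequences


def ordering_parameter(probabilities, primitives, level):
    """Right end point t(S) of the interval of each admissible sequence."""
    t, values = 0.0, {}
    for sequence in ordered_sequences(primitives, level):
        p = probabilities.get(sequence, 0.0)
        if p > 0.0:                 # forbidden strings do not appear at all
            t += p
            values[sequence] = t
    return values


if __name__ == "__main__":
    print(["".join(s) for s in ordered_sequences("01", 2)])
    # ['00', '10', '01', '11']
    print(["".join(s) for s in ordered_sequences("01", 3)])
    # ['000', '100', '010', '110', '001', '101', '011', '111']
```

The two printed lists coincide with the binary-tree example given in the text for levels two and three.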

7.  Numerical  results 

The roof map (see the appendix, eq. (A.1)) at
$a = (3 - \sqrt{3})/4$ is both topologically and metrically
simple [5]: the power spectra of the reconstructed
signals are indistinguishable from the
true one already at level $l = 2$ (where the longest
forbidden sequence $w_3 w_1 = 0011$ is detected).
Due to the (piecewise) linearity of this map, the
probabilities of compound sequences nearly factorize.
In order to show the effect of the nonlinearity,
we present, in fig. 5, the power spectrum
$S(f)$ of the actual symbolic signal (solid
line) of the logistic map at $a = 1.85$, compared
with those of the approximations obtained from
the level-one, -three and -five Markov models
(dotted lines). The area under the curve $S(f)$
versus $f$ (average over 2000 spectra of length
4096) has been normalized to 1, after subtracting
the zero-frequency component, and the natural
logarithm of $S(f)$ has been plotted. The nonlinearity
of this map renders the estimates less
accurate, for equal $l$ values, than in the previous
case, although the two systems exhibit nearly the
same topological tree structure. Since incomplete
folding leads to descriptions in terms of variable-length
orbits, the position of the peaks in the

Fig.  5.  Comparison  between  the  power  spectrum  S(f)  of  the 
symbolic  signal  of  the  logistic  map  x’  =  i  —  I.SSx’  (thick  line) 
and  those  obtained  from  the  level-one.  -three  and  -five 
reconstructions  (dotted  lines). 

spectrum  finds  an  easy  interpretation.  For  exam¬ 
ple,  the  primitives  in  both  cases  have  lengths  1,  2 
and  3:  therefore,  a  broad  peak  centred  at  a 
frequency  between  5  and  |  is  expected  (recall  that 
/=  5  is  the  largest  measurable  frequency  in  a 
discrete-time  signal).  The  exact  position  depends 
on  the  probabilities  of  the  corresponding  se¬ 
quences.  The  quadratic  maximum  in  the  logistic 
equation  produces  square-root  singularities  in 
the  invariant  measure  at  the  forward  images  of 
x  =  0.  In  particular,  the  probability  of  sequences 
containing  several  zeros  is  higher  than  in  the 
hyperbolic  case.  This  explains  the  shape  of  the 
main  peak  in  fig.  5,  determined  by  the  primitives 
01  and  001,  which  is  sharper  than  the  corre¬ 
sponding  one  of  the  roof  map  and  is  less  accu¬ 
rately  reproduced  for  the  same  /  -values.  Forbid¬ 
den  sequences  at  low  levels  in  the  tree  are 
responsible  for  the  appearance  of  secondary 
peaks,  one  of  which  is  visible  in  the  figure, 
around  /  =  0.09.  These  mechanisms  will  be  dis¬ 
cussed  in  more  detail  in  the  next  section,  where  a 
hierarchy  of  exactly  solvable  problems  will  be 
introduced.  In  fig.  6,  we  show  the  results  ob- 
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Fig.  6.  Same  as  in  fig.  5  for  the  Henon  map  at  standard 
parameter  values:  the  approximations,  in  this  case,  have  been 
obtained  using  level  one,  two  and  three. 


tained  for  the  Henon  map  at  (a  =  1.4,  b  =  0.3), 
where  the  population  of  level  one  has  been 
truncated  after  nine  primitives  (inclusion  of  more 
code  words  did  not  change  the  quality  of  the 
picture)  and  reconstructions  have  been  per¬ 
formed  with  level  one,  two  and  three.  It  is 
clearly  seen  that  the  primitives  (of  lengths  1,  2, 
7,  8,  etc.)  account  for  the  main  peaks  at  fre¬ 
quencies  (close  to)  ^ ,  7  and  g .  The  smaller  peak 
around  /  =  |  corresponds  to  the  level-three  orbit 
5  =  0111.  In  this  case,  the  higher  levels  contain 
orbits  of  length  larger  than  Since  their 

probabilities  cannot  be  estimated  directly,  ap¬ 
proximated  values  have  been  calculated  using  a 
predictor  of  the  same  type  as  those  discussed 
above. 
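The spectral estimate described in this section can be sketched in a few lines. The following is a minimal illustration, not the procedure actually used for fig. 5: the partition (symbol 1 for $x \ge 0$, symbol 0 otherwise), the initial condition, and the much smaller number and length of segments are assumptions made only to keep the example fast, and the function names are hypothetical.

```python
import cmath

def logistic_symbols(n, a=1.85, x0=0.3, transient=1000):
    """Symbolic signal of x' = 1 - a*x**2 (assumed partition: 1 if x >= 0, else 0)."""
    x = x0
    for _ in range(transient):
        x = 1.0 - a * x * x
    out = []
    for _ in range(n):
        out.append(1 if x >= 0.0 else 0)
        x = 1.0 - a * x * x
    return out

def averaged_spectrum(symbols, seg_len):
    """Periodogram averaged over non-overlapping segments.

    Returns S at the frequencies k/seg_len, k = 1..seg_len//2 (the
    zero-frequency component is dropped), normalized to unit area.
    """
    n_seg = len(symbols) // seg_len
    half = seg_len // 2
    acc = [0.0] * half
    for m in range(n_seg):
        seg = symbols[m * seg_len:(m + 1) * seg_len]
        mean = sum(seg) / seg_len
        for k in range(1, half + 1):
            amp = sum((s - mean) * cmath.exp(-2j * cmath.pi * k * t / seg_len)
                      for t, s in enumerate(seg))
            acc[k - 1] += abs(amp) ** 2
    df = 1.0 / seg_len
    area = sum(acc) * df
    return [v / area for v in acc]

symbols = logistic_symbols(64 * 40)          # 40 segments of 64 symbols
spectrum = averaged_spectrum(symbols, 64)
freqs = [k / 64 for k in range(1, 33)]
print("main peak near f =", freqs[spectrum.index(max(spectrum))])
```

With longer segments and more averaging the estimate becomes smoother; the natural logarithm of `spectrum` is the quantity plotted in the figures.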

All spectra displayed in the figures have been obtained by recoding the
original signals into primitives and by computing the probabilities $P'$
in the image signal $\mathcal{S}_1$. The improvement is considerable,
especially for the logistic and Hénon maps (spectra reconstructed
without recoding are reported in ref. [20]). For PD or QP, iteration of
the recoding procedure immediately yields the (unique) exact signal,
starting from any seed word, without the need to choose sequences at
random with the transition matrix M.
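Recoding a symbolic signal into primitives is a left-to-right parse, which is unambiguous as long as no code word is the prefix of another one. The sketch below illustrates this with the three primitives 1, 01 and 001 of the roof-map example; the function names and the test string are hypothetical, chosen only for illustration.

```python
def is_prefix_free(words):
    """True if no code word is a proper prefix of another one."""
    return not any(a != b and b.startswith(a) for a in words for b in words)

def recode(signal, words):
    """Parse `signal` left to right into code words.

    Returns the list of code-word indices (the image signal) and the
    unparsed tail, which is empty if the signal ends on a word boundary.
    """
    index = {w: i for i, w in enumerate(words)}
    image, buf = [], ""
    for ch in signal:
        buf += ch
        if buf in index:          # a primitive is recognized as soon as it ends
            image.append(index[buf])
            buf = ""
    return image, buf

primitives = ["1", "01", "001"]
image, tail = recode("10110010100101", primitives)
print(image, repr(tail))
```

Applying `recode` again to the image signal, with the primitives of the new alphabet, gives the next step of the recoding (renormalization) procedure.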

The unfolding method requires the detection of variable-length code
words whenever they provide the minimal encoding of the source signal.
However, it is always possible to construct ordinary (block-code [6])
trees, where equal-length sequences are allocated at each level (i.e.,
the primitives coincide with the symbols of the alphabet), even in the
presence of incomplete-folding dynamics. The whole modelling procedure
remains unchanged, except that the description is less compact (as
explained before, many unnecessary strings are considered), the
estimates obtained are less asymptotic (short strings without recurrence
properties do not reflect average properties of the system) and,
especially, no recoding (renormalization) is possible. The accuracy
achievable using block-code trees at some level $l$ of resolution is
lower than or comparable with that obtained from the variable-length
method at a level $l' \approx l/\theta$ (where $\theta$ is the average
Markov time of eq. (3.3)), without recoding: for the logistic map at
$a = 1.85$, $\theta = 1.86$ and the level-three results shown in fig. 5
are quantitatively close to those given by a level-six binary-tree
model. If, however, the leftmost peaks in the spectrum correspond to
frequencies significantly (i.e., at least twice) smaller than
$1/\theta$, the convergence of our method is much more rapid: this is
the case of the Hénon map, for which the peaks at frequencies 1/7 and
1/8 appear already in a level-one reconstruction using only 7
primitives. With a block code, of course, one needs at least 8 levels
(i.e., a considerable number of sequences). The parameter $\theta$ is a
measure of the model's code compression [6] and depends, for a given
system, on the criterion used to define the primitives: the value
$\theta = 1$ indicates no compression (block code).

8.  Invariance  under  smooth  coordinate  changes 

The reconstruction of the symbolic signal is based on two ingredients
which are invariant under smooth coordinate changes: the topology and
the metrics (sequence probabilities) of the logic tree. When dealing
with real signals $x_n$ (as, e.g., experimental time series), however,
it is necessary to consider the values assumed by the recorded
observable $x$ as well. This gives the non-invariant part of the
spectrum $S(f)$. Of course, the full signal and the associated symbolic
one yield, in general, different power spectra. The topological rules
reflect the complicated folding mechanism acting on phase space and
represent the most fundamental dynamical properties of the system, while
the probabilistic features can affect the shape of the spectra only
within the constraints of the topological structure. They are
particularly relevant in the non-hyperbolic case, because of the
presence of singularities in the invariant measure. Finally, the effect
of the actual $x$-values on $S(f)$ is weaker than or comparable to that
of the probabilities. As an example, consider the well-known Bernoulli
and tent maps [23], which yield the same symbolic dynamics (a random,
$\delta$-correlated sequence of 0's and 1's) but are, respectively,
order preserving and order reversing. In the former case, the spectrum
of the signal $x_n$ is Lorentzian (with half-width $\lambda = \ln 2$)
whereas, in the latter, it is white (as is that of the symbolic signal).
As a further illustration of these differences, we show, in fig. 7, the
spectrum of the symbolic signal of the Lorenz system and, in fig. 8, the
corresponding one for the $x$-variable on the Poincaré section: an
average over 2000 trajectories of length $N = 4096$ has been taken. In
this case, the two spectra have nearly the same shape, apart from a
rescaling of the vertical axis: the symbolic dynamics can be
approximated by a nonlinear version of the Bernoulli shift which does
not completely reach the two fixed points at $x = 0$ and $x = 1$, where
the motion is close to intermittency [39]. Therefore, the peak visible
in the figures around frequency zero is mainly determined by the
distribution of the times spent by the system in the intermittent
regions, rather than by the $x$-values (as it was instead for the
Bernoulli map), and it is not Lorentzian. Indeed, even in the much more
complicated case of the Hénon map, only minor modifications of the
overall shape are attributable to the coordinate values. These can be
included in the reconstructed signals by associating to each sequence
$S$ a number $n = |S|$ of $x$-values given by
$\bar{x}_k = \langle x \rangle_k$ (the center of mass of phase-space
element $f^k(\Delta_S)$), $k = 0, \ldots, n-1$, where $f^k(\Delta)$
denotes the $k$th image of set $\Delta$ under the action of map $f$.
When the symbolic string $S$ is chosen, the real sequence
$\{\bar{x}_0, \ldots, \bar{x}_{n-1}\}$ is attached to the reconstructed
signal to be Fourier transformed. A further improvement is obtained by
adding to each center of mass a (Gaussian-distributed) random number
with variance $\sigma_k^2 = \langle [x - \bar{x}_k]^2 \rangle_k$, which
is also evaluated from the coordinates of the phase-space points falling
in element $f^k(\Delta_S)$ [20]. In this way, the dynamics is split into
a macroscopic part, given by the (finite-time) jumps between elements in
phase space, and a microscopic random contribution which represents the
unresolved motion within each element of the covering.

Fig. 7. Same as in fig. 5, for the Lorenz system at standard parameter
values, using level one, two and three (the latter reconstruction being
nearly coincident with the true spectrum).

Fig. 8. Power spectrum of the $x$-coordinate in the Poincaré section
$\Sigma = \{(x, z): \dot{x} = 10(y - x) = 0,\ \ddot{x}\,\mathrm{sgn}(x) < 0\}$
of the Lorenz system at standard parameter values.
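The splitting into centers of mass and a random microscopic part can be illustrated on a one-dimensional example. The sketch below is only a rough illustration under several assumptions: it uses the logistic map at $a = 1.85$ with the partition at $x = 0$, takes as words the blocks ending with symbol 1, and draws the words independently with their observed frequencies (a level-one model); all function names are hypothetical.

```python
import random

def logistic_orbit(n, a=1.85, x0=0.3, transient=1000):
    x = x0
    for _ in range(transient):
        x = 1.0 - a * x * x
    orbit = []
    for _ in range(n):
        orbit.append(x)
        x = 1.0 - a * x * x
    return orbit

def word_statistics(orbit):
    """For each word (block ending with symbol 1) return its count and,
    for every position k in the word, the mean and variance of x."""
    samples, word, xs = {}, "", []
    for x in orbit:
        word += "1" if x >= 0.0 else "0"
        xs.append(x)
        if word.endswith("1"):
            samples.setdefault(word, []).append(tuple(xs))
            word, xs = "", []
    stats = {}
    for w, rows in samples.items():
        per_pos = []
        for k in range(len(w)):
            col = [r[k] for r in rows]
            mean = sum(col) / len(col)                        # center of mass
            var = sum((v - mean) ** 2 for v in col) / len(col)
            per_pos.append((mean, var))
        stats[w] = (len(rows), per_pos)
    return stats

def surrogate(stats, n_words, rng):
    """Concatenate randomly chosen words, replacing each symbol by the
    center of mass plus Gaussian noise with the measured variance."""
    words = sorted(stats)
    weights = [stats[w][0] for w in words]
    out = []
    for w in rng.choices(words, weights=weights, k=n_words):
        for mean, var in stats[w][1]:
            out.append(rng.gauss(mean, var ** 0.5))
    return out

stats = word_statistics(logistic_orbit(5000))
signal = surrogate(stats, 200, random.Random(1))
print(sorted(stats), len(signal))
```

The surrogate signal, rather than the bare symbol sequence, would then be Fourier transformed; higher-level models replace the independent draw of words by the corresponding conditional probabilities.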

Finally, notice that the power spectrum of a continuous-time system may
differ substantially from that computed on the Poincaré section
$\Sigma$, when the return times of the trajectory on $\Sigma$ have a
broad distribution (cf., for example, figs. 7 and 8 with the spectrum of
the full signal generated by the Lorenz system (6.2), reported in ref.
[40]). The hierarchical modelling proposed in ref. [5] and illustrated
above can be complemented with local maps which approximate the real
trajectories from a phase-space element on the Poincaré section $\Sigma$
to its image on $\Sigma$ itself. These transformations are determined by
fitting the orbits with suitable polynomials [37].

9.  A  hierarchy  of  Markov  models 

The intuitive interpretation of the shape of power spectra given in the
previous section can be rendered more rigorous by studying a few
analytically solvable examples. We consider Markov models corresponding
to the simplest five subshifts of finite type, in a hierarchy
characterized by an increasing number (or length) of forbidden words.
The resulting topology of the tree will determine the main features in
the spectrum, whereas variation of the transition probabilities will
cause smooth modifications, within the limits imposed by the topology.

The finite automaton which describes a given regular language can be
represented by a directed graph $D = (V, A)$, where
$V = \{v_1, \ldots, v_m\}$ is a set of vertices and
$A = \{\ldots, v_k v_l, \ldots\}$ is a set of ordered pairs of vertices,
called arcs [25] (arc $v_k v_l$ obviously leading from $v_k$ to $v_l$).
The arcs are labelled with symbols $s_i$ from the alphabet
$\{0, \ldots, r-1\}$ (which we take as binary, for simplicity), where
$i$ ranges from 1 to the number of arcs in the graph. At most $r = 2$
arcs can leave a vertex: the language is the set of the symbolic words
generated by following any allowed path on the graph. Furthermore, each
arc is characterized by a transition probability $p_{kl}$ which
satisfies the relation $\sum_l p_{kl} = 1$, $\forall k$ (i.e., there is
a unit probability to leave any vertex $v_k$). When the automaton is
used to generate a signal, a random choice is made at each bifurcating
vertex, according to the attached probabilities (see fig. 9). Therefore,
the process can be described by a Markov transition matrix $R$,
analogous to those considered in section 5. Notice that no memory is
taken into account in these simple models: a given arc is followed with
a given probability, independently of the past, at variance with what
happens when using trees, where all allowed paths are explicitly
considered, together with the respective weights. The autocorrelation
function $C(\tau)$ can be computed analytically from the knowledge of
the graph by following a straightforward procedure [41]. The necessary
information is stored in the matrix $R$, whose generic element $R_{ij}$
represents the probability of the arc-to-arc transition $s_i \to s_j$:
this quantity can be easily computed from the arc (vertex-to-vertex
transition) probabilities $p_{kl}$ defined above. Representing the
binary symbols $\{s_1, s_2, \ldots\}$ attached to the arcs (taken in any
order) and their corresponding steady-state probabilities
$\{\pi_1, \pi_2, \ldots\}$ as vectors $\mathbf{s}$ and
$\boldsymbol{\pi}$, the autocorrelation function
$C(\tau) = \langle s_n s_{n+\tau} \rangle - \langle s \rangle^2$,
$\tau = 0, 1, \ldots$, is given by the expression [41]

$$C(\tau) = \boldsymbol{\varpi} \cdot R^{\tau} \cdot \mathbf{s} - (\boldsymbol{\pi} \cdot \mathbf{s})^2 , \qquad (9.1)$$

where $R^{\tau}$ is the $\tau$th power of $R$ and
$\boldsymbol{\varpi} = \{\pi_1 s_1, \pi_2 s_2, \ldots\}$.

We  discuss,  with  the  help  of  this  procedure, 
five  Markov  models  related  to  continuous 
piecewise-linear  maps,  showing  that  the  presence 
of  forbidden  sequences  causes  the  emergence  of 
“coherent”  structures,  starting  from  a  flat 
spectrum. 

As already mentioned, the symbolic dynamics of the tent map (eq. (A.1)
with $a = 0$) is a random process with

$$C(\tau) = \begin{cases} p(1-p), & \text{if } \tau = 0, \\ 0, & \text{otherwise,} \end{cases}$$

where $p = P(0)$, so that $S(f) = 1$, $\forall f$. A well-known metric
modification of such a complete-topology system is provided by the
telegraphic signal [42], whose spectrum is a Lorentzian centered at the
origin. The simplest topological alteration of the full binary tree is
obtained by forbidding the sequence 00: this case corresponds to the
roof map at $a = 1/2$. The resulting primitives are then 1 and 01, so
that a complete binary tree is recovered. The transition matrix $R$ has
the three real eigenvalues $\lambda_1 = 0$, $\lambda_2 = 1$ and
$\lambda_3 = -p$, where $p = P(01) = 1 - P(1)$. Using eq. (9.1), one
finds [41]

$$C(\tau) = \frac{p}{(1+p)^2} \, (-p)^{\tau} .$$

Hence, the power spectrum is a Lorentzian centered at $f = 1/2$.
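Expression (9.1) is straightforward to evaluate numerically. The sketch below does so for the graph of the language in which only 00 is forbidden; the labelling of the two vertices and three arcs, the value of $p$, and the function names are assumptions made for illustration. The result is compared with the geometric form $p(-p)^{\tau}/(1+p)^2$ that follows from a direct calculation for this two-vertex graph.

```python
def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def autocorrelation(R, s, pi, tau_max):
    """C(tau) = sum_i pi_i s_i (R^tau s)_i - (pi . s)^2, tau = 0..tau_max."""
    mean = sum(p_i * s_i for p_i, s_i in zip(pi, s))
    weighted = [p_i * s_i for p_i, s_i in zip(pi, s)]
    vec, out = list(s), []
    for _ in range(tau_max + 1):
        out.append(sum(w * v for w, v in zip(weighted, vec)) - mean ** 2)
        vec = mat_vec(R, vec)            # vec becomes R^(tau+1) s
    return out

p = 0.6                                   # probability of emitting 0 after a 1
# Vertices: A (last symbol 1), B (last symbol 0).
# Arcs: 0 = A->A (symbol 1), 1 = A->B (symbol 0), 2 = B->A (symbol 1).
s = [1, 0, 1]
R = [[1 - p, p, 0.0],                     # after arc 0 we are in A
     [0.0, 0.0, 1.0],                     # after arc 1 we are in B: 1 is forced
     [1 - p, p, 0.0]]                     # after arc 2 we are in A
pi = [(1 - p) / (1 + p), p / (1 + p), p / (1 + p)]   # steady state of R

C = autocorrelation(R, s, pi, 6)
closed = [p * (-p) ** tau / (1 + p) ** 2 for tau in range(7)]
print(C)
```

The alternating sign of $C(\tau)$ is what places the Lorentzian at $f = 1/2$.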

Our third model shows how a peak at a generic frequency $0 < f < 1/2$ is
produced. We consider the roof map at $a = (3 - \sqrt{5})/4$, where the
dynamics yields all combinations of the three primitives 1, 01 and 001,
000 being the only forbidden word. The critical point belongs to a
period-four orbit with kneading sequence [22] C100. The corresponding
Markov graph is displayed in fig. 9a. The 5×5 matrix $R$ can be reduced
to a 3×3 matrix with two non-trivial eigenvalues (the other one being
equal to 1), which are the roots $z = z_1$ and $z = z_2$ of the equation
$z^2 + pz + pq = 0$. The meaning of the parameters $0 < p < 1$ and
$0 < q < 1$ is clear from the figure. Accordingly, one obtains

$$C(\tau) = \frac{p}{1 + p + pq} \left( k_1 z_1^{\tau+1} + k_2 z_2^{\tau+1} \right) ,$$

where $k_j = (z_j + q)(1 - z_j)^{-1}(z_j^2 - pq)^{-1}$. When the $z_j$'s
become complex, the power spectrum (see fig. 9d) presents a peak at a
frequency between 1/2 and 1/3, depending on the values of $p$ and $q$.
This can be easily verified by writing the roots in the exponential form
$z_j = \rho_j \exp(i\omega_j)$ and computing the angular part for the
limit values 0 or 1 which $p$ and $q$ can assume. In fact, the changes
in the probabilities modify the shape of the spectrum only within the
bounds imposed by the topology: the lengths of the primitives (1, 2 and
3) clearly indicate the region in which a peak is expected.
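The closed form for this model can be cross-checked against a direct iteration of the vertex transition matrix. The sketch below assumes a particular reading of the graph (vertex A after a 1, B after a single 0, C after two 0's, with $p$ the probability of emitting 0 from A and $q$ the probability of emitting a second 0 from B) and compares $C(\tau) = \frac{p}{1+p+pq}\sum_j k_j z_j^{\tau+1}$, with $k_j = (z_j+q)(1-z_j)^{-1}(z_j^2-pq)^{-1}$, with the directly computed autocorrelation; the function name and parameter values are illustrative.

```python
import cmath

def third_model(p, q, tau_max):
    # Vertex transition matrix in the order A, B, C (assumed labelling).
    T = [[1 - p, p, 0.0],
         [1 - q, 0.0, q],
         [1.0, 0.0, 0.0]]
    pi_A = 1.0 / (1 + p + p * q)          # steady-state probability of symbol 1
    row, direct = [1.0, 0.0, 0.0], []     # distribution started in A
    for _ in range(tau_max + 1):
        direct.append(pi_A * (row[0] - pi_A))
        row = [sum(row[i] * T[i][j] for i in range(3)) for j in range(3)]
    # Roots of z^2 + p z + p q = 0 (distinct as long as p != 4 q).
    disc = cmath.sqrt(p * p - 4 * p * q)
    roots = [(-p + disc) / 2, (-p - disc) / 2]
    k = [(z + q) / ((1 - z) * (z * z - p * q)) for z in roots]
    closed = [(p / (1 + p + p * q)
               * sum(kj * z ** (tau + 1) for kj, z in zip(k, roots))).real
              for tau in range(tau_max + 1)]
    freq = abs(cmath.phase(roots[0])) / (2 * cmath.pi)
    return direct, closed, freq

direct, closed, freq = third_model(0.5, 0.5, 8)
print(freq)    # angular part of the complex roots, in units of cycles per step
```

For $p = q = 1/2$ the roots have modulus 1/2 and phase $\pm 2\pi/3$, i.e. an oscillation of the correlation function with period 3.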

The last two examples concern the appearance of a second peak, related
to the presence of a further prohibition in the language. In particular,
we consider two equivalent cases by taking either three primitives with
a level-two forbidden concatenation or two primitives with a level-three
prohibition. The first model has the same topology as the roof map at
$a = (3 - \sqrt{3})/4$, where the forbidden sequences are 000
(determining the primitives 1, 01 and 001) and $0011 = w_3 w_1$. The
corresponding graph is displayed in fig. 9b. The 6×6 matrix $R$ has
three non-trivial eigenvalues $z_j$, given by the roots of the equation
$z^3 + pz^2 + pqz + (pq - q) = 0$, and the correlation function reads

$$C(\tau) = \frac{1 - q}{1 + p - q + 2pq} \sum_{j=1}^{3} k_j z_j^{\tau} ,$$

where

$$k_j = \frac{(z_j^3 + pq - q)(z_j^3 + z_j pq - q)}{z_j \, p(1 - q)(z_j^3 + 2q) + (z_j^3 - q)^2} .$$

Three power spectra are shown in fig. 9e, for $q = 0.7$ and various
choices of $p$. A second peak around frequency zero develops when $p$
decreases from 1. The final example has the same tree topology as the
roof map at $a = 5/8$, where the dynamics yields all concatenations of
the two primitives 1 and 01, except for $01101 = w_2 w_1 w_2$ (see fig.
9c). The critical point belongs to a period-five orbit with kneading
sequence C1011. Also in this case the transition matrix has size 6 and
three non-trivial eigenvalues, roots of the equation
$z^3 + pz^2 + (p + q - 1)z + pq = 0$. The correlation function has the
same form as in the previous case:

$$C(\tau) = \frac{1}{q + 2p + pq} \sum_{j=1}^{3} k_j z_j^{\tau} , \qquad (9.6)$$

where, after simplification by means of the eigenvalue equation,

$$k_j = \frac{p^2 q \, z_j^3}{(1 - p) z_j^2 (z_j^2 + q - 1)^2 + 2pq z_j (2 z_j^2 + q - 1)} .$$

The power spectrum, shown in fig. 9f, exhibits a second peak to the left
of frequency 1/4 (corresponding to the orbit 0111, which is the only
allowed extension of 011), in addition to the first one, centered around
$f = 1/2$ and determined by the primitives (of lengths 1 and 2). In
fact, every prohibition corresponds to a vertex with a single outgoing
arrow (for a binary alphabet) in the transition graph $D$, so that
relatively short cycles can appear in $D$ with a small number of exit
branches (and an associated small escape probability). Finally, notice
that position, height and width of the peaks depend on the eigenvalues
of the transition matrix $R$, which cannot be simply related to
properties (lengths, Lyapunov exponents) of the unstable periodic orbits
in the dynamical map $f$.
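The last model can be checked numerically in the same spirit. The sketch below rests on assumptions that are not spelled out in the text: the four vertices are taken as A (generic), B (after 0), C (after 01) and D (after 011), $p$ is the probability of emitting 0 from A, $q$ the probability of emitting 1 from C, and the coefficients are used in the simplified form $k_j = p^2 q z_j^3 / [(1-p) z_j^2 (z_j^2+q-1)^2 + 2pq z_j (2z_j^2+q-1)]$. The cubic is solved with a plain Durand-Kerner iteration, chosen only to avoid external libraries.

```python
import cmath

def cubic_roots(c2, c1, c0, sweeps=200):
    """Roots of z^3 + c2 z^2 + c1 z + c0 by Durand-Kerner iteration."""
    roots = [complex(0.4, 0.9) ** k for k in range(3)]
    for _ in range(sweeps):
        for i in range(3):
            z = roots[i]
            num = ((z + c2) * z + c1) * z + c0
            den = 1.0
            for j in range(3):
                if j != i:
                    den *= z - roots[j]
            roots[i] = z - num / den
    return roots

def fifth_model(p, q, tau_max):
    # Vertex transition matrix in the order A, B, C, D (assumed labelling).
    T = [[1 - p, p, 0.0, 0.0],     # A: emit 1 (stay) or 0 (go to B)
         [0.0, 0.0, 1.0, 0.0],     # B: 00 is forbidden, so 1 is forced
         [0.0, 1 - q, 0.0, q],     # C: emit 0 (back to B) or 1 (go to D)
         [1.0, 0.0, 0.0, 0.0]]     # D: 0110(1) is forbidden, so 1 is forced
    norm = q + 2 * p + p * q
    pi_B = p / norm                # steady-state probability of symbol 0
    row, direct = [0.0, 1.0, 0.0, 0.0], []
    for _ in range(tau_max + 1):
        direct.append(pi_B * (row[1] - pi_B))
        row = [sum(row[i] * T[i][j] for i in range(4)) for j in range(4)]
    roots = cubic_roots(p, p + q - 1, p * q)

    def k(z):
        return p * p * q * z ** 3 / ((1 - p) * z * z * (z * z + q - 1) ** 2
                                     + 2 * p * q * z * (2 * z * z + q - 1))

    closed = [(sum(k(z) * z ** tau for z in roots) / norm).real
              for tau in range(tau_max + 1)]
    return direct, closed, roots

direct, closed, roots = fifth_model(0.5, 0.5, 8)
freqs = sorted(abs(cmath.phase(z)) / (2 * cmath.pi) for z in roots)
print(freqs)
```

For $p = q = 1/2$ the real negative root gives the component at $f = 1/2$, while the complex pair corresponds to a frequency slightly below 1/4.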


10.  Conclusions 

We have shown that power spectra of nonlinear dynamical systems are
determined by the topological and metric properties of the symbolic
dynamics, representable on a logic tree, and by the actual numerical
values of the recorded observable $x$. The former two ingredients are
dynamical invariants and affect the shape of the spectra more deeply
than the latter one, which is obviously non-invariant. Therefore, the
evaluation of power spectra (which is often the simplest operation to
perform on experimental data) assumes a new importance: in fact, the
presence of broad peaks can be immediately related to the topology of
the logic tree (primitives and first few levels). This information
provides guidance in the construction of simple Markov models and in the
search for the relevant unstable periodic orbits of the system.
Moreover, the spectra contain information about the metric features of
the dynamics and possibly depend on the coordinate values $x$. Finally,
we have evaluated numerically the probability scaling function for all
examples presented, displaying, in particular, the results obtained for
the Lorenz system. The relation between the convergence of the Markov
models and the scaling function has been discussed in terms of a
suitable complexity measure.

Appendix A

In this appendix we illustrate an algorithm for the identification of
periodically extendable primitives. This is a generalization of the
method proposed in ref. [5]. The purpose is to select those strings
which exhibit some distinguished recurrence properties, as explained in
section 3. All substrings $S$ of the signal are tested in order of
increasing length $n$, starting from the single symbols ($n = 1$)
themselves. We want to identify either periodic sequences (i.e.,
cyclically repeating up to the largest available string length
$n_{max}$) or, if none are found, orbits which can be extended
periodically up to a limited total length (number of symbols) $n_{min}$.

Single symbols are considered first: those which are found to be
$n_{min}$-extendable (i.e., such that $s^{n_{min}}$ is observed at least
once in $\mathcal{S}_0$) are accepted as primitives, while all others
are classified as “transient”. Choosing $n_{min} = n_{max}$ (as we
usually do) is equivalent to requiring full periodicity. If, however, no
unstable fixed point lies infinitesimally close to the attractor, this
condition may not be fulfilled for large $n_{min}$ (if no symbol repeats
sufficiently many times consecutively in $\mathcal{S}_0$). When this
happens, all blocks of length 2 must be examined. The
$n_{min}$-extendable ones are then taken as primitives. All others which
have positive probability are transient orbits (notice that some blocks
of length 2 might be forbidden altogether). If, in turn, no primitive
block of length 2 has been found, those of length 3 are analyzed, and so
on up to a cutoff length $n_{cut} \le n_{min}$. If still no primitive
has been identified, the acceptance condition is weakened to an
extendability length $n_{min} - 1$ and the whole process is restarted
from the beginning ($n = 1$). In this way, primitives of some length
$n \le n_{cut}$ are necessarily found (all other allowed strings with
the same length $n$ being transient orbits) since, in the worst case,
$n_{min}$ can decrease down to $n_{cut}$ itself. The parameter $n_{cut}$
is introduced only because of computer memory limitations, as is
$n_{max}$, and represents the longest range in which we try to detect
differences in the recurrence behaviour of the various orbits. If the
procedure is unsuccessful (i.e., if $n_{min}$ reaches the value
$n_{cut}$ and all orbits with that length exist), an ordinary block-code
tree is constructed. Usually, $n_{cut} < n_{max}/2$ (to identify
possible blocks of length $n_{cut}$ having at least one full cyclic
extension).

In general, the set of primitives found with the above procedure is
still incomplete: all longer ones are missing, unless a block code has
been chosen.
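The search can be sketched as follows. This is a simplified reading of the procedure, not a transcription of it: a block is called extendable here if its periodic repetition, cut to $n_{min}$ symbols, occurs somewhere in the signal, and the longer primitives are obtained by prepending transient blocks to the primitives already found. The parameter names mirror $n_{max}$, $n_{min}$ and $n_{cut}$; the function names and the deterministic test signal (built from the words 1, 01 and 001, with 001 never followed by 1) are hypothetical.

```python
def extendable(block, signal, n_min):
    """True if the periodic extension of `block`, cut to n_min symbols, occurs."""
    reps = -(-n_min // len(block))                 # ceiling division
    return (block * reps)[:n_min] in signal

def first_primitives(signal, n_max, n_cut, alphabet="01"):
    """Shortest extendable blocks, the transient ones, and the n_min reached."""
    for n_min in range(n_max, n_cut - 1, -1):      # weaken the condition stepwise
        blocks = list(alphabet)
        for n in range(1, n_cut + 1):
            existing = [b for b in blocks if b in signal]
            prim = [b for b in existing if extendable(b, signal, n_min)]
            if prim:
                return prim, [b for b in existing if b not in prim], n_min
            blocks = [b + c for b in existing for c in alphabet]
    return [], [], n_cut

def complete_primitives(signal, primitives, transients, n_max):
    """Add the primitives formed by a transient prefix followed by a primitive."""
    found, frontier = list(primitives), list(primitives)
    while frontier:
        new = [t + w for t in transients for w in frontier
               if len(t + w) <= n_max and (t + w) in signal
               and (t + w) not in found]
        found += new
        frontier = new
    return found

cycle = ["1"] * 7 + ["01", "001", "01", "1", "001", "001", "01"]
signal = "".join(cycle) * 50
prim, trans, n_min = first_primitives(signal, n_max=8, n_cut=3)
primitives = complete_primitives(signal, prim, trans, n_max=8)
print(prim, trans, n_min, primitives)
```

On this signal only symbol 1 is extendable at $n = 1$, symbol 0 is transient, and the completed set of code words is 1, 01, 001.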

Example 1. Consider the piecewise-linear roof map defined by

$$x_{n+1} = \begin{cases} a + 2(1 - a)x_n , & \text{if } x_n \in [0, \tfrac{1}{2}) \quad (s_n = 0), \\ 2(1 - x_n) , & \text{if } x_n \in [\tfrac{1}{2}, 1] \quad (s_n = 1). \end{cases} \qquad (A.1)$$

For $a = (3 - \sqrt{3})/4$ [5], a Markov partition exists: in fact, the
critical point $x = 1/2$ belongs to an unstable period-five orbit with
symbolic label (kneading sequence [22]) C1001, where the letter C
denotes $x = 1/2$ itself. The unit interval can be divided into three
subsets, labelled by the sequences $w_1 = 1$ (interval $[1/2, 1]$),
$w_2 = 01$ (left pre-image of element 1) and $w_3 = 001$ (left pre-image
of 01). The symbolic dynamics yields all possible random combinations of
these three primitive strings, except for the forbidden orbit
$w_3 w_1 = 0011$: this is the minimal description, which we want to
discover from the analysis of the signal $\mathcal{S}_0$. Clearly, only
symbol 1 is periodically extendable: symbol 0 occurs at most twice
consecutively in the signal and is, therefore, transient. In fig. 2, we
display the corresponding logic tree: symbol 1 is allocated at level
one, whereas symbol 0 is reported only for convenience of illustration,
encircled with a dashed line.

The construction of the tree then proceeds by forming new sequences
$S' = Sw$ as concatenations of any admissible string $S$ (including
transient orbits) with a primitive $w$. If sequence $S'$ in turn exists
(i.e., if $P(S') > 0$), it is allocated on the tree, at the position
determined by its parental relations. The possible combinations of
length two are 01 (transient-primitive) and 11 (primitive-primitive),
both admissible. Sequence 11 is the descendant of 1 and, therefore,
appears at level 2. Sequence 01 is the second primitive: in fact,
transient strings are attributed a virtual level 0 and all primitives
formed by concatenations have a transient prefix. Eventually, level 1
will contain all primitives. Transient orbits play a role only in the
search for the code words; they are totally neglected once the latter
have all been found (or when all sequences up to $|S| = n_{max}$ have
been considered). The possible combinations of length 3 are then 001,
011, 101 and 111; 001 is the third primitive, 011 and 101 are allocated
at level two and 111 at level three. Notice that sequences such as 10,
010 or 100 are not even generated with this procedure: they are
irrelevant for a left-to-right sequential reconstruction of the signal.
When forming the sequences of length 4, we discover that 0001 and 0011
do not exist: they are topological prediction errors and appear
encircled with a solid line in fig. 2. When a new combination is formed,
if it happens to contain one of these forbidden orbits (as, for example,
$10011 = w_1 w_3 w_1$), it is obviously rejected a priori. Notice that a
more redundant model for the symbolic dynamics would be obtained using
the four-element Markov partition, with the associated nine
prohibitions. The same tree is obtained for the logistic map (3.2), at
$a = 1.85$, up to the second level (differences appearing from level
three).
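The prohibitions quoted in this example can be checked on a numerically generated signal. The sketch below iterates the roof map (A.1) at $a = (3 - \sqrt{3})/4$ in double precision and lists the words of length 4 that actually occur; the initial condition, the length of the signal and the function names are arbitrary choices made for illustration.

```python
import math

A = (3.0 - math.sqrt(3.0)) / 4.0

def roof(x, a=A):
    """Roof map (A.1): left branch of slope 2(1 - a), right branch of slope -2."""
    return a + 2.0 * (1.0 - a) * x if x < 0.5 else 2.0 * (1.0 - x)

def symbolic_signal(n, x0=0.3, transient=100):
    x = x0
    for _ in range(transient):
        x = roof(x)
    out = []
    for _ in range(n):
        out.append("0" if x < 0.5 else "1")
        x = roof(x)
    return "".join(out)

signal = symbolic_signal(20000)
words4 = sorted({signal[i:i + 4] for i in range(len(signal) - 3)})
print(words4)                      # 0000, 0001, 0011, 1000 should be missing
print("000" in signal, "0011" in signal)
```

Every word of length 4 containing 000 or equal to 0011 is absent from the list, in agreement with the two prohibitions of the minimal description.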

For a generic language, it may happen that, at the end of the process,
some tree level is not complete because it should contain sequences
longer than $n_{max}$, the probability of which cannot be measured. This
is the case of the Hénon map at standard parameter values. Then, a
normalization to one of the total probability of level one is carried
out (if the code is not complete; otherwise this is automatically
obtained) and the rest of the tree is completed by using suitable
predictors (see the next section) for the probabilities $P(S)$ of the
missing sequences $S$. The errors introduced by this procedure are
negligible if $n_{max}$ is sufficiently large. Finally, notice that
primitive sequences have little in common with “prime” sequences
(strings which cannot be decomposed into a number $k > 1$ of repetitions
of a single subword [4]).

Example 2. Periodic chaos is also automatically handled: if the
attractor consists of $p$ chaotic regions, which alternate periodically,
the lengths of the primitives are all multiples of $p$. No block shorter
than $p$, in fact, can be extended periodically to arbitrary length: for
the detection of the “right” code words it is sufficient to choose
$n_{cut} \ge p$. This is equivalent to studying the $p$th iterate of the
map, which admits a generating partition composed of $r^p$ elements (the
$p$th refinement of the original one).

Examples 3 and 4. In the signals generated at PD and QP (see section 3),
instead, no substring repeats consecutively more than twice. In fact,
one has: $\mathcal{S}_0 = 01101001\ldots$ at PD and
$\mathcal{S}_0 = 10101101\ldots$ at QP. The appropriate coarse-graining
of the signal is hence automatically obtained when $n_{min}$ has
decreased to 4 or 2, respectively, yielding the code words $w_1 = 01$,
$w_2 = 10$, $w_3 = 0010$, $w_4 = 1101$, ... (for PD, with
$n_{cut} = 2$) and $w_1 = 1$, $w_2 = 01$ (for QP, with $n_{cut} = 1$).
In the former case, although the whole signal can be encoded with the
two words 01 and 10 only, an infinity of primitives is found, since the
relation $\sum_j P(w_j) = 1$ holds by construction. Usage of a larger
$n_{cut}$ yields directly a higher-order coarse graining, in which the
primitives $w'$ are the images under $\psi_{PD}$ of those given above:
for $n_{cut} = 4$ one finds the second-generation encoding
$w'_1 = 0110 = \psi_{PD}(w_1)$, $w'_2 = 1001 = \psi_{PD}(w_2)$, and so
on.
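The PD example can be reproduced with a short substitution. The sketch below assumes that $\psi_{PD}$ acts on single symbols as 0 → 01, 1 → 10, which is consistent with the first eight symbols and with the two second-generation words quoted above but is not stated explicitly in the text; the function names are hypothetical.

```python
PSI_PD = {"0": "01", "1": "10"}      # assumed form of the substitution

def substitute(word, rule=PSI_PD):
    return "".join(rule[c] for c in word)

def has_cube(s, max_len):
    """True if some block of length <= max_len occurs three times in a row."""
    return any(s[i:i + n] * 3 == s[i:i + 3 * n]
               for n in range(1, max_len + 1)
               for i in range(len(s) - 3 * n + 1))

signal = "0"
for _ in range(10):                  # ten substitutions: 1024 symbols
    signal = substitute(signal)

pairs = {signal[i:i + 2] for i in range(0, len(signal), 2)}
print(signal[:16], pairs, has_cube(signal, 16))
```

The parse into blocks of length 2 uses only the words 01 and 10, and no block is found three times in succession, as stated for the PD signal.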

Our procedure yields a so-called prefix-free code: i.e., no code word is
the prefix of another one. Hence, if one is given the set of primitives,
it is possible to recognize them immediately in the signal, as soon as
they are received from the source [6]. In this way, recoding of the
primitives into new symbols $(w_1, w_2, \ldots) \to (0, 1, \ldots)$ is a
straightforward task.

While evaluating scaling functions of fractal dimensions and Lyapunov
exponents from time series in the traditional way, assumptions are made
concerning the grammar of the underlying dynamical system; it is
implicitly assumed that the length of the substrings considered is
sufficient to capture the grammatical properties of the system. In this
contribution, we show where this assumption becomes relevant. We give an
example of a simple grammatical rule which leads to badly behaving
convergence properties of the associated scaling functions. As another
consequence of our investigations, we conclude that, whenever a finite
grammar is encountered, the cycle expansion approach of Cvitanović using
periodic orbits should preferably be used.


1.  Introduction 

Since a chaotic attractor is commonly defined as the closure of its
nonforbidden periodic orbits [1], it should not be too surprising that the
grammatical rules, which allow only part of the periodic orbits to exist,
exert a nonnegligible influence on the associated scaling functions (not
of the Feigenbaum type). Whereas the appearance of nonhyperbolic
contributions can be traced in the scaling functions derived from time
series without knowledge of the underlying grammar, the effect of
changing grammatical rules is far less easy to investigate. However, in

'Correspondence  to:  Priv.-Doz.  Dr.  Jurgen  Parisi,  Physics 
the present contribution, we intend to make clear that such an influence
cannot be neglected: it will be shown to have a large impact on the form
of the associated scaling functions. Although in the generic case of
experimental time series a large number of nonhyperbolic points is to be
expected [2], we restrict ourselves to a hyperbolic model, in order to be
able to isolate this influence from otherwise dominating nonhyperbolic
effects. In the evaluation of experimentally obtained time series, two
different directions are commonly followed. In one approach (subsequently
called the “probabilistic” approach), the generalized Rényi dimensions are
calculated, from which the well-known entropy-like scaling function
$f(\alpha)$ can be derived. The other approach deals with the logarithmic
stretching rates describing the stability of the system, called Lyapunov
exponents, for which an entropy-like scaling function can be derived in
much the same way. It is this scaling function that we consider, in order
to demonstrate the properties mentioned above. Traditionally, the
so-called thermodynamic averages [3, 4] are calculated from the
“canonical” partition function [5], and only partial use is made of
periodic orbits. However, more recent work by Cvitanović and
collaborators suggests that the use of a “grandcanonical-like” partition,
which explicitly takes account of the unstable periodic orbits of the
system, should be numerically better suited for these purposes [6].

In our contribution, we consider different restrictions imposed on a
three-scale Cantor set by simple but nontrivial grammars. We investigate
the effects they exert upon the scaling behavior and discuss the
relationship with experimentally obtained time series. We show that the
traditional approach, although not entirely worthless, leads to severe
numerical problems of various kinds, so that it is difficult to
extrapolate the asymptotic behavior from finite symbolic substrings of the
length accessible to the computer. Finally, we point out that the
discovery of hidden grammatical rules can lead to a rather drastic change
of the scaling functions. To demonstrate this by example, we consider a
three-scale Cantor set. For the relevance of such a model, see e.g. ref.
[7]. The generalized entropy function [8, 9], which contains all the
relevant information on the scaling behavior of the system, is then
calculated (a) from the partition function upon increasing the length of
the substrings considered, and (b) from the zeta-function approach of
Cvitanović [6] and Ruelle [3]. From the generalized entropy function, the
thermodynamic averages can easily be derived.

2.  Thermodynamic  formalism 

The thermodynamic formalism starts with a partition of the phase space
which is compatible with the iteration of the dynamical map. Using a
partition consisting of M symbols, in analogy to statistical mechanics,
the partition function [5]

$$Z_G(q, \beta, n) = \sum_{j \in \{1, \ldots, M\}^n} l_j^{\beta}\, p_j^{q} \qquad (1)$$

is defined. Here the size of the jth region of the partition is denoted
by $l_j$, whereas the probability of falling into this region is denoted
by $p_j$ ($p_j = \int_{j\text{th region}} \rho(x)\, dx$, where $\rho(x)$
means the natural measure). $\beta$ and $q$ are sometimes called
“filtering exponents”; n denotes the “level” of the partition. Local
scaling of $l$ and $p$ in n is expected. In this way, the length scale
$l$ and the probability $p$ give rise to scaling exponents $\varepsilon$
and $\alpha$ through

$$l_j = e^{-n \varepsilon_j} , \qquad (2)$$

$$p_j = l_j^{\alpha_j} , \qquad (3)$$

respectively. Using the above expressions, from the partition function
the generalized free energy $F_G$ can be derived [4, 8, 9]:

$$F_G(q, \beta) = \lim_{n \to \infty} \frac{1}{n} \log \sum_{j \in \{1, \ldots, M\}^n} e^{-n \varepsilon_j (\alpha_j q + \beta)} , \qquad (4)$$

where log denotes the natural logarithm. A generalized entropy function
$S_G(\alpha, \varepsilon)$ is then introduced through the “global” scaling
assumption that the number of regions N which have scaling exponents
between $(\alpha, \varepsilon)$ and $(\alpha + d\alpha, \varepsilon + d\varepsilon)$
scales as

$$N(\alpha, \varepsilon)\, d\alpha\, d\varepsilon \sim e^{n S_G(\alpha, \varepsilon)}\, d\alpha\, d\varepsilon . \qquad (5)$$

From the generalized free energy $F_G$, the relationship with the
generalized entropy $S_G$ is found in a standard way [5-9],

$$S_G(\alpha, \varepsilon) = F_G(q, \beta) + (\alpha q + \beta)\, \varepsilon , \qquad (6)$$

where those values of $\alpha$ and $\varepsilon$ leading to the
maximum of $F_G$ (as a function of given $q$ and $\beta$)

R. Stoop, J. Parisi / Invariants from finite symbolic substrings


have been chosen [9]. The free energy and the generalized entropy thus
describe the scaling behavior of the dynamical system equivalently. For
$\beta = 0$, the information-theoretical Rényi entropies follow from (4).
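
The link between the free energy and the entropy function can be made explicit with a short saddle-point argument. The following is a sketch, assuming the local scaling forms $l_j = e^{-n\varepsilon_j}$ and $p_j = l_j^{\alpha_j}$ and the global scaling law (5); it is a standard Legendre-transform argument and not a transcription of the authors' own derivation.

```latex
\begin{align*}
Z_G(q,\beta,n)
  &= \sum_j l_j^{\beta}\, p_j^{q}
   = \sum_j e^{-n \varepsilon_j (\alpha_j q + \beta)} \\
  &\sim \iint N(\alpha,\varepsilon)\,
        e^{-n \varepsilon (\alpha q + \beta)}\, d\alpha\, d\varepsilon
   \sim \iint e^{\,n\,[\,S_G(\alpha,\varepsilon)
        - (\alpha q + \beta)\varepsilon\,]}\, d\alpha\, d\varepsilon \\
  &\sim \exp\Bigl( n \max_{\alpha,\varepsilon}
        \bigl[\, S_G(\alpha,\varepsilon)
        - (\alpha q + \beta)\varepsilon \,\bigr] \Bigr)
        \qquad (n \to \infty), \\[4pt]
F_G(q,\beta)
  &= \lim_{n\to\infty} \frac{1}{n} \log Z_G(q,\beta,n)
   = S_G(\alpha^{*},\varepsilon^{*})
     - (\alpha^{*} q + \beta)\,\varepsilon^{*}, \\[4pt]
\text{with}\quad
  &\left.\frac{\partial S_G}{\partial \alpha}\right|_{*} = q\,\varepsilon^{*},
  \qquad
  \left.\frac{\partial S_G}{\partial \varepsilon}\right|_{*}
     = \alpha^{*} q + \beta .
\end{align*}
```

Rearranging the second line of the result gives $S_G = F_G + (\alpha q + \beta)\varepsilon$ at the maximizing pair $(\alpha^{*}, \varepsilon^{*})$, which is the form of relation (6).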

From the generalized free energy, two more specific free energies and
entropies can be derived by restriction: for $q = 1$, we obtain the
generalized Lyapunov exponents [10], whereas for $q = 0$ the free energy
is obtained which leads to the maximum value of $S_G(\alpha, \varepsilon)$
with respect to variation of $\alpha$ alone for given $\varepsilon$ [11].
The latter free energy and the associated entropy are denoted by
$F_G(\beta)$ and $S_G(\varepsilon)$, respectively. For the discussion of a
dynamical system, the scaling function $S_G(\varepsilon)$ is, in some
respects, better suited than the scaling functions of the generalized
Lyapunov exponents. The fractal dimensions (see refs. [5, 12, 13]) are
obtained as the zeros $\beta_0(q)$ of $F_G(q, \beta)$ for given $q$.
Therefore, via (1) and the generalized entropy function, the
characteristic thermodynamic functions can, in principle, be calculated
from an approximation of the system by strings of a finite length n, for
experimental settings as well as for model systems [9].
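
To make the finite-n evaluation concrete, here is a minimal sketch for an unrestricted three-symbol set. It assumes a purely multiplicative model, i.e. the length and probability of a word are the products of the single-symbol values, and it uses illustrative numbers for those values; the function names are hypothetical. Under these assumptions the level-n free energy is independent of n, which gives a simple consistency check, and the zero $\beta_0(q)$ can be found by bisection.

```python
import itertools
import math

LENGTHS = (0.1, 0.2, 0.3)  # illustrative length scales l_i (assumed values)
PROBS = (0.2, 0.3, 0.5)    # illustrative probabilities p_i, summing to 1


def partition_sum(q, beta, n, lengths=LENGTHS, probs=PROBS):
    """Brute-force sum of l_j**beta * p_j**q over all words j of length n."""
    total = 0.0
    for word in itertools.product(range(len(lengths)), repeat=n):
        l_j = math.prod(lengths[s] for s in word)  # multiplicative model (assumption)
        p_j = math.prod(probs[s] for s in word)
        total += l_j ** beta * p_j ** q
    return total


def free_energy(q, beta, n, lengths=LENGTHS, probs=PROBS):
    """Level-n estimate (1/n) log Z(q, beta, n)."""
    return math.log(partition_sum(q, beta, n, lengths, probs)) / n


def beta_zero(q, lengths=LENGTHS, probs=PROBS, lo=-60.0, hi=60.0, tol=1e-12):
    """Zero of the free energy in beta for fixed q, found by bisection.

    For the multiplicative model the free energy is log(sum_i l_i**beta * p_i**q),
    which is strictly decreasing in beta because every l_i < 1.
    """
    def g(beta):
        return math.log(sum(l ** beta * p ** q for l, p in zip(lengths, probs)))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (1, 2, 4, 6):
        print(n, free_energy(q=0.5, beta=1.0, n=n))
    print("beta_0(q=0) =", beta_zero(0.0))
```

For $q = 0$ the zero solves $\sum_i l_i^{\beta} = 1$, the familiar equation for the dimension of a multi-scale Cantor set. Once a grammar forbids some words the sum no longer factorizes, and the n-dependence discussed in the text appears.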

3.  Model  systems 

In  order  to  demonstrate  the  rather  generic 
convergence  properties,  we  consider  the  follow¬ 
ing  situation:  Assume  that  in  the  sequence  space 
the  system  can  be  coded  with  the  help  of  three 
symbols.  Therefore,  as  an  appropriate  model  for 
a  dissipative  dynamical  system,  we  consider  a 
three-scale  Cantor  set  and  use  for  the  symbolic 
description  the  symbols  A,  B,  and  C.  We,  fur¬ 
thermore,  impose  a  grammar  by  the  requirement 
that  a  three-fold  repetition  of  one  symbol,  C, 
cannot  occur.  For  the  symbols  A,  B,  and  C,  the 
length  scales  and  the  associated  probabilities  are 
specified. At level n = 1, the following length scales and probabilities
are observed: $l_A = 0.1$, $l_B = 0.2$, $l_C = 0.3$; $p_A = 0.2$,
$p_B = 0.3$, $p_C = 0.5$.


We then presume that a hierarchical discovery is being made in the
following sense: While at level 2 the branching probabilities remain, at
level 3 the particularity of the sequence CCC, which is not allowed, is
detected. For simplicity, we assume that, as the effect of this discovery,
the forbidden substring is proportionally distributed to the neighboring
branches CCA and CCB [14, 15].

In this way, a nontrivial grammar is imposed on the fractal. In contrast
to what could be expected, it is not possible to exclude at a given level
n, once and for all, the generation of the forbidden substring: The
generation of the fractal is “not closed” with respect to the iteration
number n. The hope is then that an increase of the length n could remedy
the situation. Taking into account the structure of the grammar, it is
easily seen that this expectation can only partially be justified. In
order to investigate the convergence of the approach, we now start from
the ensemble of all possible substrings of length n and calculate the
generalized entropy function $S_G(\alpha, \varepsilon)_n$. Then, we derive
the entropy-like functions $f(\alpha)_n$ and $S_G(\varepsilon)_n$ for
different lengths n [16].
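
The effect of the grammar on the ensemble of length-n substrings can be explored with a small sketch. It enumerates the words over A, B, C that do not contain CCC and sums a multiplicative weight over them, once by brute force and once with a three-state recursion on the number of trailing C's. Note the simplifying assumption: the weight of forbidden words is simply dropped here, whereas the text redistributes it proportionally to CCA and CCB; the helper names are hypothetical.

```python
import itertools
import math

L_SCALE = {"A": 0.1, "B": 0.2, "C": 0.3}  # length scales quoted for level n = 1
P_SCALE = {"A": 0.2, "B": 0.3, "C": 0.5}  # probabilities quoted for level n = 1


def symbol_weights(q, beta):
    """Single-symbol weights l**beta * p**q (multiplicative model, an assumption)."""
    return {s: L_SCALE[s] ** beta * P_SCALE[s] ** q for s in "ABC"}


def allowed_words(n, forbidden="CCC"):
    """Yield all words of length n that do not contain the forbidden substring."""
    for letters in itertools.product("ABC", repeat=n):
        word = "".join(letters)
        if forbidden not in word:
            yield word


def restricted_sum_bruteforce(weights, n):
    """Sum of word weights over the allowed words, by explicit enumeration."""
    return sum(math.prod(weights[s] for s in word) for word in allowed_words(n))


def restricted_sum_dp(weights, n):
    """Same sum via a recursion on the number of trailing C's (0, 1 or 2)."""
    state = [1.0, 0.0, 0.0]  # weight of the empty word, which ends in zero C's
    other = weights["A"] + weights["B"]
    for _ in range(n):
        total = sum(state)
        state = [total * other,            # append A or B: run of C's is reset
                 state[0] * weights["C"],  # first C of a run
                 state[1] * weights["C"]]  # second C of a run; a third is forbidden
    return sum(state)


if __name__ == "__main__":
    w = symbol_weights(q=0.0, beta=1.0)
    for n in range(1, 13):
        print(n, math.log(restricted_sum_dp(w, n)) / n)
```

Printing the level-n estimate for increasing n shows how slowly a finite-length sum settles once a forbidden word is present, in contrast to the unrestricted set, where the estimate does not depend on n at all.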

4.  Results 

In ref. [9], the form of the generalized entropy function of the
unrestricted Cantor set has already been discussed. When the restriction
is imposed, upon increasing the length of the substrings considered, the
support of the entropy function converges towards the modified,
asymptotic form (fig. 1). As a function of n, we obtain a rather
oscillatory convergence (fig. 2) which is due to the fact that different
symbolic substrings could not yet be excluded at the given level n.
However, this convergence is affected in part by the numerical inaccuracy
of the computational tools. For example, the numerical accuracy is
limited by the computation time used for the summation of the partition
function up to length




Fig. 1. Asymptotic form of the generalized entropy function,
calculated from the zeta-function approach. Shown is the
support of $S_G(\alpha, \varepsilon)$, i.e., the region in the
$(\alpha, \varepsilon)$-plane for which $S_G(\alpha, \varepsilon)$ is a
nonzero, convex function. The dashed line indicates the support of
$S_G(\varepsilon)$.

of 12. Furthermore, the use of high values of the weighting exponents
$q$ and $\beta$ poses severe numerical problems (in our computations, we
use a range from -30 to 30 for both parameters). Nevertheless, it can be
seen that it is possible to unveil some remaining, important invariant
properties of the system. For instance, the maximum of $S_G(\varepsilon)$
is rather well approximated by $S_G(\varepsilon)_n$ (see fig. 3). The
fact that these entropy functions do not come down to zero could be taken
as an indication of a phase-transition-like effect [9, 17]. However, the
second approach shows that this is not the case. (The problem of


Fig.  2.  Oscillatory  convergence  towards  the  asymptotic 
generalized  entropy  function  which  is  due  to  the  fact  that  at  a 
given  level  of  n  not  all  sequences  which  would  be  forbidden 
for  larger  n  can  be  excluded  without  suppressing  other, 
allowed,  sequences.  Shown  is  again  the  support  of  the  en¬ 
tropy  function. 


Fig. 3. Convergence of the entropy functions $S_G(\varepsilon)_n$ towards
the asymptotic entropy function $S_G(\varepsilon)$. The convergence
is limited by poor resolution for large n (see text).


phase  transitions  will  be  discussed  extensively  in 
a  forthcoming  publication.)  Note  that  the  asymp¬ 
totic  entropy  function  was  calculated  by  making 
use  of  the  zeta-function  approach  [6]. 

Having pointed out the intrinsic superiority of the “grandcanonical-like”
approach for our problem, we proceed by investigating the effect upon the
associated, more specific scaling functions $f(\alpha)$, $\phi(\lambda)$,
and $S_G(\varepsilon)$, if the same rules are applied alternatively to the
symbol strings AAA and BBB, respectively. In this case, the effects on the
scaling functions are seen to be of a rather different nature: We obtain
asymptotically the supports of the generalized entropy functions, as shown
in figs. 4 and 5. From these figures, the effects of the restrictions
exerted upon the specific scaling functions can already be estimated. In
figs. 6 and 7, we display the associated scaling functions
$S_G(\varepsilon)$. Note, however, that no phase-transition-like behavior
is obtained in this way.


Fig. 4. Support of the generalized entropy function for the
case of forbidden substring AAA.

Fig. 5. Support of the generalized entropy function for the
case of forbidden substring BBB.

Fig. 6. $S_G(\varepsilon)$ obtained for the model with forbidden
substring AAA.

Fig. 7. $S_G(\varepsilon)$ obtained for the model with forbidden
substring BBB.


5.  Conclusions 

Let us emphasize that the effects described are very similar to those
observed during the evaluation of, e.g., the “dynamical” or “geometrical”
scaling functions $\phi(\lambda)$ from experimental time series. Upon
increasing the length of the substrings considered, different shapes of
the associated scaling function $\phi(\lambda)$ are obtained, seemingly
contradicting the law of independent averages, due to the nontrivial role
of the grammar of the system. At least for experimental systems with a
finite grammar, the approach via zeta functions has been shown to lead to
an accuracy of the scaling function not known to be attained by the
traditional approach. For infinite grammars, however, the situation is far
less favorable; the cycle expansions typically converge only poorly.

Our results indicate that the use of the zeta-function approach for the
evaluation of the thermodynamic averages can be superior to that of the
more traditional ones. Much of the success of this approach, however,
depends on the fact that the alphabet associated with our model can be
reformulated to constitute a complete one. As a consequence, for an
experimental setting or a model with a complex grammar, the application of
this powerful tool may not be straightforward. Therefore, the evaluation
via the more traditional approach, with all its disadvantages, will often
be a reasonable alternative if information as complete as possible about
the scaling behavior of a system is to be extracted. Investigations of
nonhyperbolic systems are discussed elsewhere in detail [18].
We propose a new approach to the analysis of chaotic spatial field distributions (including the snapshots of the fields
evolving with time), taking as a basis a translational dynamical system with d times. A space series is used to determine
quantitative characteristics of disorder such as spatial correlation and pointwise dimensions.


1.  Introduction 

The  disordered  spatial  field  distribution  (den¬ 
sity,  temperature,  velocity,  etc.)  or  simply  disor¬ 
der  is  usually  analysed  in  terms  of  its  simplest 
statistical  characteristics  such  as  spatial  spec¬ 
trum,  correlation  scale,  and  the  like  [1].  Knowl¬ 
edge  of  these  standard  characteristics  is  extreme¬ 
ly  important  for  the  analysis  of  chaotic  struc¬ 
tures.  They  enable  us,  in  particular,  to  dis¬ 
tinguish  short-range  and  long-range  order,  to 
determine  the  size  of  domains,  to  describe  statis¬ 
tical  properties  of  a  random  field  picture,  and  so 
on.  However,  traditional  approaches  to  the  anal¬ 
ysis  of  spatial  disorder  give  no  information  about 
its  deterministic  origin.  In  particular,  they  do  not 
show  that  a  finite  dimensional  dynamical  system 
may  generate  disorder. 

A  “dynamical”  approach  to  spatial  disorder 
appears  to  be  extremely  interesting.  Actually, 
even  the  fact  that  disorder  may  be  described  by  a 
low-dimensional  dynamical  system  gives  addi¬ 
tional  information  about  its  origin.  Time  vari¬ 
ation  of  the  properties  of  a  “spatial”  dynamical 

'  Current  address:  Mathematics  Institute,  University  of 
Warwick,  Coventry  CV4  7AL,  UK. 


system  that  generates  a  snapshot  answers  the 
question  about  evolutionary  peculiarities  of  dis¬ 
order.  Finally,  “dynamical”  disorder  may  be 
separated  from  spatial  (speckle)  noise. 

Our  paper  is  concerned  with  the  description  of 
spatial  disorder  in  terms  of  nonlinear  dynamics. 
To  this  end  we  introduce  the  concept  of  a  “trans¬ 
lational  dynamical  system”.  Its  meaning  is  readi¬ 
ly  understood  by  use  of  an  example  of  a  one¬ 
dimensional  space  series,  i.e.  an  instantaneous 
field  distribution  with  one  spatial  coordinate.  In 
this  case,  the  spatial  coordinate,  say  x,  can  be 
treated as time and the space series, $u(x)$, will be
an  observable,  i.e.  a  function  of  the  point  on  a 
trajectory  of  the  dynamical  system  which  will  be 
referred  to  as  a  translational  dynamical  system. 
Employing  the  approach  proposed  by  Packard 
and  Takens  [2,  3],  we  can  reconstruct  the  in¬ 
variant  set  that  contains  this  trajectory  and  calcu¬ 
late,  for  instance,  its  fractal  dimension.  This 
procedure  is  very  much  like  the  processing  of 
time  series,  but  in  our  case  we  know  in  advance 
that the translational dynamical system describing the space series
$u(x)$ is “reversible”, because such a system for isotropic unbounded
media (it is only these media that we consider here) must be invariant on
substituting $-x$ for $x$. Therefore, the translational dynamical system
of interest must possess properties similar to those of a Hamiltonian
system.

Thus we can use a direct spatio-temporal analogy for a one-dimensional
space series. For d spatial coordinates this analogy can be used in a
wider sense; then instead of a traditional dynamical system we propose to
employ a dynamical system with d times (the so-called action of the
$\mathbb{Z}^d$ or $\mathbb{R}^d$ groups). These dynamical systems were
considered, for example, in ref. [4].

The motion along the trajectory of such a dynamical system in the
direction of the kth “time” is equivalent to the shift of the spatial
picture along the kth spatial coordinate.

The  concept  of  the  trajectory  of  an  invariant 
set  of  a  translational  dynamical  system  with  “d 
times”  is  essential  for  the  generalization  of  the 
Packard-Takens  approach  that  will  be  proposed 
here.  Besides,  we  will  generalize  Grassberger- 
Procaccia’s  algorithm  [5]  for  the  calculation  of 
the  correlation  dimension  of  space  series  with  d 
spatial  coordinates. 

The  merits  of  this  approach  are  illustrated  by 
an  example  of  chaotic  capillary  ripples  observed 
on  the  surface  of  a  horizontal  layer  of  fluid 
excited  parametrically  [6].  It  was  revealed,  in 
particular,  that  the  spatial  dimension  of  the  snap¬ 
shot  of  ripples  increases  with  supercriticality. 

2.  Translational  dynamical  system 

When  formalizing  the  characteristics  of  spatial 
distribution  snapshots  we  will  need  notions  such 
as  phase  space,  translational  dynamical  system, 
dimension. 

Consider a set of continuous (vector-) functions $u(x)$,
$x \in \mathbb{R}^d$ ($u \in \mathbb{R}^p$), employing conventional
procedures of summation and scaling. Introducing a distance into this set,
we obtain a metric space B that will be referred to as the phase space of
the system. Let each d-dimensional vector
$a = (a_1, \ldots, a_d) \in \mathbb{R}^d$ correspond to the translation
map $T^a: B \to B$ that is determined from the expression
$T^a u(x) = u(x + a)$. Thus we determine the action of the group
$\mathbb{R}^d$ on B or, in other words, we are concerned with a dynamical
system with d times that will be referred to as a translational dynamical
system.

If  the  process  under  study  is  such  that  knowing 
the  initial  state  (initial  field  distribution)  one  can 
unambiguously  determine  the  subsequent  states 
at  any  moment  of  time,  then  a  semigroup  of 
evolution  operators  also  acts  on  B,  i.e. 

an  evolution  dynamical  system  is  determined  as 
well.  The  behaviour  of  the  trajectories  of  transla¬ 
tional  and  evolutional  dynamical  systems  in  the 
common  phase  space  B  gives  a  full  mathematical 
description  of  the  spatio-temporal  properties  of 
the  nonequilibrium  medium  of  interest. 

3.  The  characteristics  of  snapshots 

The characteristics of the snapshot $u(x)$ do not, evidently, depend on
the reference system in $\mathbb{R}^d$. In other words, these
characteristics must describe an invariant set of points along the
trajectory of a translational dynamical system:
$\overline{\{T^a u(x)\}}_{a \in \mathbb{R}^d} = A_{u(x)}$. Accordingly,
the value $C(A_{u(x)})$ ($C(M)$ is the limit capacity of the set M) will
be referred to as the limit capacity, or the fractal dimension, of the
snapshot $u(x)$. The Hausdorff dimension of the snapshot and other
measure-independent characteristics are determined in a similar fashion.
If a measure $\mu$ that is invariant relative to $T^a$ is determined on
$A_{u(x)}$, then the $\mu$-dependent characteristics (e.g., pointwise or
correlation dimensions) will also be referred to as the characteristics
of the snapshot (see appendix).

Note that if, for example, a two-dimensional snapshot is periodic with
respect to $x_1$ and $x_2$, then the set $A_{u(x)}$ is merely a
two-dimensional torus; if the snapshot has a quasi-periodically repeating
structure, then $A_{u(x)}$ is also a torus but now of a higher dimension;
while for the patterns chaotically distributed over the plane, $A_{u(x)}$
will be a fractal set. The time evolution of the
snapshots  (space  series)  corresponds  to  the  mo- 
tion  of  the  set  ^4,  in  the  space  B:  A A 

Bearing in mind the analogy with ordinary dynamical systems and for the
sake of simplicity, below we will consider the space and time to be
discrete, i.e. instead of $\mathbb{R}^d$ we will use $\mathbb{Z}^d$ and
instead of $\mathbb{R}_+$ we will take $\mathbb{Z}_+$.

Generalizing the Takens approach [3] to systems with d-dimensional time,
we can consider the snapshots to be generated by a finite-dimensional
dynamical system, provided that its fractal dimension is finite. The
snapshot $u(j)$, $j = (j_1, \ldots, j_d) \in \mathbb{Z}^d$, is called a
finite-dimensional one if: (1) there exists a dynamical system with
d-dimensional time and a finite phase space M; (2) there exists a
Lipschitz-continuous one-to-one map $h: A_u \to M$ such that the inverse
map $h^{-1}$ is also Lipschitz-continuous. Here $A_u$ = closure
$\{T^a u,\ a \in \mathbb{Z}^d\}$.

We  will  provide  the  space  of  the  sequences 
B  =  {«(/■), yeZ**}  with  the  norm  ||ii||  =  \uj\l 
where  |/|  =  ly'il  +  •  •  •  +  lij  and  |«^.|  is  the 
usual  norm  in  R''.  We  can  readily  verify  that  B  is 
a Banach space. For a fixed snapshot $u(j)$, let the fractal dimension of
the set $A_{u(j)}$ be finite, $C(A_{u(j)}) < \infty$, and let the set be
compact. Let $m > 0$ be an integer such that $m^d \ge 2C(A_{u(j)}) + 1$.
Through $C_m$ we will denote an integral d-dimensional cube with the side
m, i.e. $C_m = \{(j_1, \ldots, j_d) \in \mathbb{Z}^d,\ 0 \le j_i < m\}$.
Let also $M_m = \{v(j),\ j \in C_m,\ v(j) \in \mathbb{R}^p\}$. Apparently,
$M_m$ is the $m^d$-dimensional subspace of the space B. Let
$\Pi_m: B \to M_m$ be a natural projection, i.e. the map which establishes
the correspondence of the snapshot $u(j)$ to the element $v(j) \in M_m$
such that $u(j) = v(j)$ for any $j \in C_m$. According to the Mañé
theorem [7], one-to-one (and bicontinuous) projections are typical among
$B \to M_m$ on the set $A_{u(j)}$. Assume that $\Pi_m$ is a typical
natural projection. Then a dynamical system generated by the map
$\tilde{T}^a = \Pi_m \circ T^a \circ \Pi_m^{-1}$, $a \in \mathbb{Z}^d$,
and having d-dimensional time is determined on the image
$E = \Pi_m(A_{u(j)})$, and the snapshot $u(j)$ will be finitely generated,
provided that $\Pi_m^{-1}$ is a Lipschitz-continuous map.


4.  Numerical  algorithms 

The lemmas proved in the appendix allow us to propose algorithms for the
calculation of the correlation and pointwise dimensions, generalizing the
algorithms from refs. [5, 8]. Let us take a two-dimensional snapshot
(d = 2) in the form of a two-dimensional array
$\{u_{ij};\ i, j \in \mathbb{Z}_+\}$. In reality the array, naturally, has
a limited size: $i < N_1$, $j < N_2$. For each integer $m \ge 1$ we will
construct $m \times m$ matrices:

$$A^{(m)}_{K,L} = \{(u_{kl}),\ k = K, \ldots, K + m - 1,\ l = L, \ldots, L + m - 1\} .$$

Let us define the correlation integral in the form

$$C^{(m)}(\varepsilon) = \frac{1}{[(N_1 - m)(N_2 - m)]^{2}}\, \#\{(K, L), (K', L') : \operatorname{dist}(A^{(m)}_{K,L}, A^{(m)}_{K',L'}) \le \varepsilon\} ,$$

where #(E) is the number of elements in the set E.

Then the ratio $\log C^{(m)}(\varepsilon) / \log \varepsilon$ for
sufficiently small $\varepsilon$ can be approximately equal to the
correlation dimension of the two-dimensional snapshot in the
m-dimensional embedding space.
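
A direct, unoptimized version of this construction might look as follows. It is a sketch under stated assumptions: the distance between two matrices is taken to be the maximum absolute difference of their entries (chosen here for simplicity, not taken from the paper), ordered pairs including self-pairs are counted, all fully contained m x m patches are used so the normalization differs slightly from the one quoted above, and the test field and function names are hypothetical.

```python
import math


def patches(u, m):
    """Return all m x m sub-matrices of the 2-D list `u`, flattened to tuples."""
    n1, n2 = len(u), len(u[0])
    out = []
    for K in range(n1 - m + 1):
        for L in range(n2 - m + 1):
            out.append(tuple(u[k][l]
                             for k in range(K, K + m)
                             for l in range(L, L + m)))
    return out


def max_dist(a, b):
    """Maximum absolute entry-wise difference (an assumed choice of metric)."""
    return max(abs(x - y) for x, y in zip(a, b))


def correlation_integral(u, m, eps):
    """Fraction of ordered patch pairs (self-pairs included) closer than eps."""
    pts = patches(u, m)
    close = sum(1 for a in pts for b in pts if max_dist(a, b) <= eps)
    return close / len(pts) ** 2


def local_slope(u, m, eps_lo, eps_hi):
    """Slope of log C versus log eps between two scales: a crude dimension estimate."""
    c_lo = correlation_integral(u, m, eps_lo)
    c_hi = correlation_integral(u, m, eps_hi)
    return (math.log(c_hi) - math.log(c_lo)) / (math.log(eps_hi) - math.log(eps_lo))


# Hypothetical test field: a doubly periodic pattern sampled on a 20 x 20 grid.
field = [[math.sin(0.7 * i) * math.sin(0.7 * math.sqrt(2.0) * j)
          for j in range(20)] for i in range(20)]

if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, local_slope(field, m, 0.1, 0.4))
```

The double loop costs on the order of the squared number of patches, which is why the text goes on to restrict the comparison to a smaller set of reference matrices.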

Following ref. [9] we will estimate the minimal size of the array
$(u_{ij})_{N_1 \times N_2}$ that is needed for correct evaluation of the
dimension within the range $[\varepsilon', \varepsilon'']$ of values of
$\varepsilon$. Since

$$D_c \approx \frac{\log_2 C^{(m)}(\varepsilon'') - \log_2 C^{(m)}(\varepsilon')}{\log_2 \varepsilon'' - \log_2 \varepsilon'} , \qquad C^{(m)}(\varepsilon') \ge \frac{1}{(N_1 N_2)^{2}} , \qquad C^{(m)}(\varepsilon'') \le 1 ,$$

then, assuming $\varepsilon'' = 2\varepsilon'$, we have the estimate

$$D_c \le 2 \log_2 (N_1 N_2) . \qquad (1)$$

Note that for a d-dimensional snapshot such an estimate has the form

$$D_c \le 2 \sum_{i=1}^{d} \log_2 N_i .$$

Thus,  when  determining  the  correlation  dimen¬ 
sion  of  a  multidimensional  snapshot  one  must 
bear  in  mind  that  the  number  of  discretization 
points  along  each  time  coordinate  may  be  much 
smaller  than  in  the  case  of  a  one-dimensional 
time. 

Since  the  construction  of  the  correlation  inte¬ 
gral  needs  a  great  number  of  calculations  of  the 
distances  between  the  matrices,  we  preferred  to 
compute  the  distance  in  the  form: 

=  fC'  +  m-  1 

l=L . L+m-l 

f  =  L‘ . L+m-l 

(2) 

The  number  of  calculations  may  be  reduced 
significantly  by  comparing  the  matrices  to 
the  array  of  reference  matrices 

{>ijC„L-*^»rcfW  =^7ref}  • 

In this case the correctness of the calculation of the correlation
dimension of a two-dimensional snapshot is determined by an estimate
similar to (1):

$$D_c \le \log_2 (N_1 \cdot N_2 \cdot i_{\mathrm{ref}} \cdot j_{\mathrm{ref}}) . \qquad (3)$$

The testing showed that the behaviour of the correlation integrals does
not, in fact, depend on turning the snapshot through an arbitrary angle,
which indicates that the algorithm is robust. Fig. 1 shows the plots of
$\log_2 R^{(m)}(r)$ versus $\log_2 r$

Fig. 1. The plot of $\log_2 R^{(m)}(r)$ versus $\log_2 r$ for the
two-dimensional field U(x, y) = sin x sin √3/2 y for different values m of
the dimension of the embedding space.


(r  =  for  a  two-dimensional  snapshot 

y)  =  sin  X  sin  \/l  y  which  is  a  two-dimen¬ 
sional  torus  in  the  corresponding  phase  space. 
The  correlation  dimension  was  calculated  to  be 
D,  e  [1.96;  2.03],  when  /V,,  N,  =  256,  = 

4. 


5.  The  dimension  of  spatial  disorder  for 
capillary  ripples 

The  procedure  for  the  calculation  of  the  corre¬ 
lation  dimension  presented  above  was  employed 
for  the  description  of  the  spatio-temporal  chaos 
of  parametrically  excited  capillary  ripples. 

Experiments on the dynamics of capillary waves were performed on a fluid
placed in a flat cuvette vibrating in the vertical direction. It is known
[10] that, if the amplitude A of vibration is higher than the critical
value $A_c$, a system of capillary waves is generated on the surface of
the fluid. If the size of the cuvette is much larger than the wavelength
of capillary ripples, then the parametrically excited ripples are a
superposition of two mutually perpendicular pairs of counter-propagating
waves, independent of the shape of the cuvette, which may be round or
square. As the supercriticality $s = A/A_c - 1$ increases, there appear
envelope waves propagating perpendicular to the originally excited pairs.
When the amplitude of vibrations is sufficiently high, the system of
envelope waves becomes chaotic: turbulence is formed. This is a quite
general scenario of the transition to wave turbulence, although its
details may differ in fluids of different viscosities (see ref. [6]).

It would appear natural that the correlation dimension calculated from
the snapshots of capillary ripples would increase with the growth of the
supercriticality. To test this intuitive idea we have made quantitative
measurements and processed them, employing the correlation dimension of
snapshots. The correlation dimension has been calculated for patterns of
capillary ripples for the supercriticalities $s \in [0.15, 1.85]$ (see
fig. 2).

The  correlation  integrals  corresponding  to  the 
snapshots  are  presented  in  fig.  3.  Our  facilities 
(we perform calculations on an EC-1037 which is
comparable  with  an  IBM  4341)  allow  for  the 
processing  of  512x512  matrices  that  are  con¬ 
structed  by  the  images  of  capillary  ripples,  with 
the  number  of  reference  points  being  /V,.  =  8. 

Results  of  calculations  show  that,  as  the  super¬ 
criticality  s  increases  from  0.15  to  1.85,  the  cor¬ 
relation  dimension  of  the  snapshots  increases. 
For  equal  supercriticalities  the  correlation  di¬ 
mensions  are  found  to  be  approximately  equal 
(see  fig.  4). 


Fig. 2. Snapshots of capillary ripples for supercriticalities: (a) s = 0.15, (b) 0.81, (c) 1.1, (d) 1.85.

Fig. 3. Correlation integrals corresponding to fig. 2: (a) $D_c$ = 6.3, (b) 7.0, (c) 7.4, (d) 8.0.


Fig.  4.  Correlation  dimension  of  snapshots  of  capillary  rip¬ 
ples  as  a  function  of  the  supercriticality. 


6.  Discussion 

The analysis of the dimension of space series, or snapshots of a medium
(field), presented in this paper may be interesting from different
viewpoints.


(1)  The  investigation  of  the  temporal  be¬ 
haviour  of  the  dimension  of  spatial  patterns  al¬ 
lows  for  the  new  understanding  of  the  evolution 
of  turbulence  (from  an  initial  state)  and  of  the 
reverse  problem  -  the  establishment  of  organized 
structures  from  spatial  disorder.  In  particular, 
the  dimension,  D^,  of  the  snapshot  is  a  very 
handy  tool  for  the  investigation  of  critical  phe¬ 
nomena  in  nonequilibrium  media.  For  example, 
it  was  revealed  in  ref.  [11]  that  within  a  complex 
Ginzburg-Landau  equation  changes  abruptly 
in  the  transition  from  the  regime  of  phase  turbu¬ 
lence  to  amplitude  turbulence. 

(2) As in the case of dynamical systems described
by ordinary differential equations [12],
one can distinguish in the snapshots a deterministically
generated component against the
background of spatial noise.

(3) Analysis of snapshots may employ, besides
the dimension, other characteristics of deterministic
chaos, for instance, topological and metric
entropies (see ref. [13]) and, perhaps,
Lyapunov exponents. Investigation of the time
evolution of these characteristics seems to be
promising for the prediction of the temporal
behaviour of snapshots.


Appendix.  Pointwise  and  correlation  dimensions 
of  snapshots 

Assume  that  A^^^^  =  X  is  a  compact  set  and 
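there exists a probability measure $\mu$ that is invariant
relative to the translational dynamical
system $T^\alpha: X \to X$, $\alpha = (\alpha_1, \ldots, \alpha_d) \in \mathbb{Z}^d$. Assume
also that the measure $\mu$ is ergodic with
respect to this dynamical system (i.e. any set
$D \subset X$ such that $T^\alpha D = D$, $\alpha \in \mathbb{Z}^d$, has a measure
$\mu(D) = 0$ or $\mu(D) = 1$). Let us designate through
$C_N$ an integral $d$-dimensional cube with the side
$N$, i.e.

$$C_N = \{\alpha = (\alpha_1, \ldots, \alpha_d):\ 0 \le \alpha_i < N\}.$$

By virtue of the generalized ergodic Birkhoff
theorem for a dynamical system with $d$-dimensional
time [4] the relation

$$\lim_{N\to\infty} \frac{1}{\#(C_N)} \sum_{\alpha \in C_N} \varphi(T^\alpha u) = \int \varphi \, d\mu \tag{A1}$$

is valid for any function $\varphi \in L^1(X, \mu)$ and for
$\mu$-almost any $u \in X$, where $\#(E)$ is the number
of elements in the set $E$. Apparently, $\#(C_N) = N^d$.

Let us describe the generalizations of the algorithms
for the calculation of the dimension of
an ergodic invariant set [5, 8] to the case of
$d$-dimensional time.

For arbitrary $u \in X$, $N > 0$ and $\varepsilon > 0$ we take

$$L_N(u, \varepsilon) = \#\{\alpha \in C_N:\ \mathrm{dist}(T^\alpha u, u) \le \varepsilon\}.$$

Taking in (A1) $\varphi(y) = \chi_{D(u,\varepsilon)}(y)$, where
$D(u, \varepsilon) = \{y \in X:\ \mathrm{dist}(u, y) \le \varepsilon\}$ is a sphere of
radius $\varepsilon$ with the centre at the point $u$ and $\chi$ is a
characteristic function:

$$\chi_D(y) = \begin{cases} 1, & y \in D, \\ 0, & y \notin D, \end{cases}$$

and considering the point $u$ to be typical with
respect to the measure $\mu$, we obtain

$$\lim_{N\to\infty} \frac{1}{N^d} L_N(u, \varepsilon) = \mu(D(u, \varepsilon)). \tag{A2}$$

Designating through $G_\mu$ a set of points typical
relative to $\mu$ (by the ergodic theorem, $\mu(G_\mu) = 1$)
we obtain from (A2) the following lemma.

Lemma 1. For any $u \in G_\mu$,

$$\lim_{\varepsilon\to 0} \lim_{N\to\infty} \frac{\log\left(L_N(u, \varepsilon)/N^d\right)}{\log \varepsilon} = d_\mu(u),$$

if there exists

$$d_\mu(u) = \lim_{\varepsilon\to 0} \frac{\log \mu(D(u, \varepsilon))}{\log \varepsilon}.$$

It should be borne in mind that the quantity
$d_\mu(u)$ is a pointwise dimension of the measure $\mu$
at the point $u$.

We now assume that

$$R_N(u, \varepsilon) = \#\{(\alpha, \beta) \in C_N \times C_N:\ \mathrm{dist}(T^\alpha u, T^\beta u) \le \varepsilon\},$$

and there exists a limit

$$D_c(\mu) = \lim_{\varepsilon\to 0} \lim_{N\to\infty} \frac{\log\left(R_N(u, \varepsilon)/N^{2d}\right)}{\log \varepsilon}. \tag{A3}$$

We will show that for typical points $u$, $D_c(\mu)$
is a correlation dimension under some additional
assumptions. Let us represent $R_N(u, \varepsilon)$ in the
form

$$R_N(u, \varepsilon) = \sum_{\alpha \in C_N} L_N(u, T^\alpha u, \varepsilon), \tag{A4}$$

where $L_N(u, v, \varepsilon) = \#\{\alpha \in C_N:\ \mathrm{dist}(T^\alpha u, v) \le \varepsilon\}$.
Apparently, $L_N(u, u, \varepsilon) = L_N(u, \varepsilon)$. The
property of ergodicity yields (cf. (A2))

$$\lim_{N\to\infty} \frac{1}{N^d} L_N(u, v, \varepsilon) = \mu(D(v, \varepsilon)),$$

in particular,

$$\lim_{N\to\infty} \frac{1}{N^d} L_N(u, T^\alpha u, \varepsilon) = \mu(D(T^\alpha u, \varepsilon)). \tag{A5}$$

Lemma 2. If $u \in G_\mu$, the convergence in (A5) is
uniform with respect to $\alpha \in \mathbb{Z}^d$ and there exists a
limit

$$\Delta(\mu) = \lim_{\varepsilon\to 0} \frac{\log \int \mu(D(v, \varepsilon)) \, d\mu(v)}{\log \varepsilon},$$

then $D_c(\mu) = \Delta(\mu)$.

Proof. By virtue of (A5) we have

$$D_c(\mu) = \lim_{\varepsilon\to 0} \frac{1}{\log \varepsilon} \lim_{N\to\infty} \log\left(\frac{1}{N^d} \sum_{\alpha \in C_N} \frac{1}{N^d} L_N(u, T^\alpha u, \varepsilon)\right)
= \lim_{\varepsilon\to 0} \frac{1}{\log \varepsilon} \lim_{N\to\infty} \log\left(\frac{1}{N^d} \sum_{\alpha \in C_N} \mu(D(T^\alpha u, \varepsilon))\right).$$

In the latter equality we assume that the convergence
of $L_N(u, T^\alpha u, \varepsilon)/N^d$ to $\mu(D(T^\alpha u, \varepsilon))$
is uniform with respect to $\alpha$. Further, because
$u \in G_\mu$, by virtue of the ergodicity theorem we
obtain

$$D_c(\mu) = \lim_{\varepsilon\to 0} \frac{1}{\log \varepsilon} \log \int \mu(D(y, \varepsilon)) \, d\mu(y) = \Delta(\mu).$$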
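
The counting functions of this appendix can be tried out on a small lattice example. The sketch below is only illustrative and takes $d = 2$: the distance between two translated snapshots is assumed to be the maximum difference over a finite window, and the names `patch`, `dist`, `cube`, `L_N`, `R_N` as well as the periodic test pattern are introduced here and are not taken from the paper.

```python
import itertools
import numpy as np

def patch(u, alpha, w):
    """Window of side w of the translated snapshot T^alpha u (d = 2)."""
    a1, a2 = alpha
    return u[a1:a1 + w, a2:a2 + w]

def dist(u, alpha, beta, w):
    """Assumed distance: max-norm difference of the two windows."""
    return np.max(np.abs(patch(u, alpha, w) - patch(u, beta, w)))

def cube(N):
    """Integer cube C_N = {alpha: 0 <= alpha_i < N} for d = 2."""
    return list(itertools.product(range(N), repeat=2))

def L_N(u, shift, eps, N, w):
    """#{alpha in C_N: dist(T^alpha u, T^shift u) <= eps}."""
    return sum(bool(dist(u, a, shift, w) <= eps) for a in cube(N))

def R_N(u, eps, N, w):
    """#{(alpha, beta) in C_N x C_N: dist(T^alpha u, T^beta u) <= eps}."""
    return sum(bool(dist(u, a, b, w) <= eps) for a in cube(N) for b in cube(N))

# Illustrative snapshot: a pattern with period 4 in both lattice directions.
i = np.arange(16)
u = np.add.outer(i % 4, 10 * (i % 4)).astype(float)
N, w, eps = 8, 3, 0.5

l_count = L_N(u, (0, 0), eps, N, w)       # recurrences of the reference window
r_count = R_N(u, eps, N, w)               # close pairs of translates
r_from_sum = sum(L_N(u, a, eps, N, w) for a in cube(N))  # decomposition as in (A4)
```

For this strictly periodic pattern only translations by multiples of the period bring the window back onto itself, so the counts do not depend on the radius once it is below the smallest non-zero distance; for a disordered snapshot one would instead follow how `l_count / N**2` and `r_count / N**4` scale with `eps`.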
The interaction and competition of transverse cavity modes in passive optical systems can lead to the formation of
spatial  field  structures  whose  symmetry  differs  from  that  of  the  empty  cavity,  while  the  time  behaviour  can  become 
irregular.  After  having  introduced  dynamical  equations  in  which  the  spatial  dependence  of  the  electric  field  is  described  by 
frequency-degenerate  Gauss-Laguerre  (GL)  modes,  the  stationary  solutions  corresponding  to  a  single  or  many  modes  are 
calculated  and  their  stability  is  discussed.  A  scan  of  the  parameter  space  allows  for  the  identification  of  several  interesting 
dynamical  regimes.  An  extensive  study  of  the  chaotic  states  is  performed  in  order  to  understand  the  relation  between  GL 
mode  excitation  and  the  relevant  active  degrees  of  freedom.  Based  on  this  analysis,  the  conditions  for  the  onset  of 
spatiotemporal  chaos,  for  its  characterisation  and  experimental  observation  are  finally  discussed. 


1.  Introduction 

Within  the  framework  of  the  theoretical  and  experimental  investigation  of  dissipative  dynamical 
systems,  much  effort  has  been  devoted  to  the  study  of  nonlinear  optical  systems,  which  display  a  rich 
variety  of  periodic,  quasiperiodic  and  chaotic  behaviour  [1].  In  the  modeling  of  optical  systems,  in  order 
to  simplify  the  mathematical  description,  large  use  has  been  made  of  the  plane-wave  approximation, 
which  assumes  the  uniformity  of  the  electric  field  in  the  planes  orthogonal  to  the  direction  of 
propagation  of  the  radiation  beam.  Obviously,  this  approximation  ignores  completely  any  effects  that 
may  arise  in  the  transverse  directions  and  lead  to  the  formation  of  spatial  patterns.  The  theoretical 
modeling  and  systematic  investigation  of  these  transverse  effects  have  recently  received  increasing 
interest,  for  both  passive  and  laser  systems  [2]. 

In  optical  systems,  the  onset  of  spatial  and  spatiotemporal  phenomena  can  be  profitably  described  in 
terms  of  interaction  and  competition  among  the  modes  of  the  empty  cavity.  Relevant  for  the  evolution 
are  the  mode  frequency  and  the  spatial  configuration  of  each  mode  with  respect  to  that  of  the  available 
gain  and  of  the  loss  profile  [3,  4].  This  mode  interaction  can  lead  to  the  formation  of  spatial  patterns 
having  symmetry  properties  different  from  those  of  the  empty  cavity  and  cause  the  onset  of  irregular 
spatiotemporal  regimes  [5].  In  such  situations,  diffraction  assumes  a  role  equivalent  to  that  played  by 
diffusion  in  the  spatiotemporal  chaotic  evolutions  observed  in  hydrodynamical  [6]  and  open  chemical 
systems [7, 8]. The theoretical understanding of these phenomena, still very rudimentary, is being
pursued following approaches typical of low-dimensional chaos and turbulence theory [9-11].


¹ Also Dipartimento di Fisica dell'Università, Milan, Italy.

In section 2 we introduce the model equations describing the evolution of the slowly varying
envelopes $F$ and $P$ of the electric field and of the polarisation for a two-level absorbing medium in a ring
cavity. Using the empty-cavity modes to describe the transverse structure of the field in the filled cavity,
and provided that the absorbing region is much smaller than the Rayleigh length $z_0$, one can describe
the space dependence of the electric field in terms of frequency-degenerate Gauss-Laguerre (GL)
modes $A_{pl}(\rho, \varphi)$, while the system's evolution is determined by the time behaviour of the GL modal
amplitudes $f_{pl}(t)$. In suitable conditions, the dynamical equations admit stationary solutions, whose
configuration may correspond to a single or to many GL modes. In both situations we identify some
prototypical conditions in which the steady state solution can be obtained analytically and derive its
expression. The section is completed by a linear stability analysis showing how the onset of the modes
of a higher-order family ($q = 1$) destabilises the singlemode TEM$_{00}$ stationary solution, giving rise to
more complex spatiotemporal structures.

In  section  3  we  study  the  dynamical  regimes  that  are  observed  scanning  the  parameter  space.  We 
prove  the  existence  of  points  in  the  transverse  plane  where  the  electric  field  intensity  vanishes;  such 
points  are  singularities  for  the  electric  field  phase  and,  therefore,  can  be  regarded  as  optical  vortices  [3]. 
We  present  some  of  the  many  different  dynamical  patterns  that  the  system  displays  and  identify 
experimentally-accessible  parameter  regions  where  the  time  behaviour  becomes  irregular. 

Section 4 is devoted to the detailed study of the irregular time regions. We test the chaoticity of the
evolution and the dimensionality of the attractor by measuring Kolmogorov metric entropy $K^{(1)}$ and
information dimension $D^{(1)}$, while the relation between GL mode excitation and degrees of freedom
effectively determining the dynamics is investigated by studying systematically the time correlations
among GL modal amplitudes $f_{pl}(t)$ and the degree of spatial coherence of the field intensity.

Finally,  in  section  5,  we  discuss  possible  future  observation  of  spatiotemporal  chaos  in  passive  optical 
systems  in  both  numerical  and  real  experiments. 


2.  Description  of  the  model 

2.1.  Deduction  of  the  dynamical  equations 

We consider a ring cavity with two spherical mirrors 1 and 2 having radius of curvature $R_0$ and
transmissivity $T$ and two perfectly reflecting plane mirrors 3 and 4 (fig. 1). The total length of the cavity
is $\mathcal{L}$, while $L$ is the distance between the two spherical mirrors and $L_A$ is the length of the absorbing
medium, which is assumed to be a homogeneously broadened collection of two-level atoms with
transition frequency $\omega_a$ and linewidth $\gamma_\perp$. $\alpha$ is the absorption rate per unit length experienced by the
light passing through the medium. An external coherent field is injected into the resonator through
mirror 1 and we assume that it is matched with the fundamental Gaussian mode of the cavity.

Fig. 1. Ring cavity with two partially reflecting spherical mirrors (1 and 2) and two perfectly reflecting
plane mirrors (3 and 4). $L_A$ is the length of the active medium, $L$ is the distance between the two
spherical mirrors and $\mathcal{L}$ is the total length of the cavity.

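In the paraxial and slowly varying envelope approximations, and choosing, in accordance with the
system's symmetry, cylindrical coordinates, the Maxwell equation for the slowly varying envelopes $F$
and $P$ of the electric field and of the atomic polarisation is

$$\frac{1}{2ik_{in}} \nabla_\perp^2 F(r, \varphi, z, t) + \frac{\partial}{\partial z} F(r, \varphi, z, t) + \frac{1}{c} \frac{\partial}{\partial t} F(r, \varphi, z, t) = -\alpha P(r, \varphi, z, t), \tag{1}$$

where $k_{in} = \omega_{in}/c$ is the wave vector associated to the frequency of the input field and

$$\nabla_\perp^2 = \frac{\partial^2}{\partial r^2} + \frac{1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \frac{\partial^2}{\partial \varphi^2}. \tag{2}$$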

Our study is based on the assumption that the empty cavity modes are suitable also to describe the
transverse structure of the field in the filled cavity [12]. These modes are in general complicated
functions of the spatial coordinates. We shall consider the limit

$$L_A \ll z_0 = \tfrac{1}{2} k_{in} w_0^2, \tag{3}$$


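where $z_0$ is the Rayleigh length and $w_0$ is the minimum waist of the beam. In this limit the modal
functions assume the simple form of the Gauss-Laguerre functions

$$A_{pl}(\rho, \varphi) = \sqrt{\frac{2}{\pi}} \left[\frac{p!}{(p + |l|)!}\right]^{1/2} (2\rho^2)^{|l|/2} L_p^{|l|}(2\rho^2) \, e^{-\rho^2} e^{il\varphi}, \tag{4}$$

where $\rho = r/w_0$ is the normalised radial coordinate, $p = 0, 1, \ldots$ is the radial index, $l = 0, \pm 1, \ldots$ is the
angular index and $L_p^{|l|}$ are the Laguerre polynomials of the indicated argument. The functions $A_{pl}$ obey
the orthonormality relation

$$\int_0^{2\pi} d\varphi \int_0^\infty d\rho \, \rho \, A^*_{pl}(\rho, \varphi) A_{p'l'}(\rho, \varphi) = \delta_{pp'} \delta_{ll'} \tag{5}$$
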

and form an orthonormal set in the transverse plane. The geometrical parameters of the cavity $\mathcal{L}$, $L$
and $R_0$ determine the eigenfrequencies of the resonator according to the formula [12]

$$\omega_{npl} = \frac{c}{\mathcal{L}} \left[2\pi n + 2(2p + |l| + 1) \cos^{-1}\left(\ldots\right)\right], \tag{6}$$



where $n = 0, 1, 2, \ldots$ is the longitudinal index. An important consequence of eq. (6) is that the
frequency of the GL modes depends on the transverse indices $p$ and $l$ only via the combination $2p + |l|$,
thus introducing degeneracy. The modes gather in degenerate families, labelled by the index
$q = 2p + |l|$, as shown in fig. 2. In the following we shall denote the transverse modes of the cavity by the
couple of indices $(p, l)$. The degenerate family of order $q$ consists of $q + 1$ modes. In the literature
mode (0, 0) is usually designated as TEM$_{00}$ and modes (0, ±1) are called TEM$^*_{01}$, hybrid modes or also
doughnut modes because of their annular intensity profile.
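
The family structure can be made concrete with a few lines of code. The sketch below enumerates the index pairs $(p, l)$ of a family and of a basis truncated at a maximum family index; the function names `family` and `truncated_basis` are introduced here for illustration only.

```python
def family(q):
    """All index pairs (p, l) with 2p + |l| = q and p >= 0."""
    modes = []
    for p in range(q // 2 + 1):
        l_abs = q - 2 * p
        if l_abs == 0:
            modes.append((p, 0))
        else:
            modes.extend([(p, l_abs), (p, -l_abs)])
    return modes

def truncated_basis(q_max):
    """Modes of all families with index q <= q_max."""
    return [mode for q in range(q_max + 1) for mode in family(q)]

family_sizes = [len(family(q)) for q in range(6)]
basis_q4 = truncated_basis(4)
```

Each family of order $q$ contributes $q + 1$ modes, so a basis truncated at `q_max` contains `(q_max + 1) * (q_max + 2) // 2` modal amplitudes.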
Fig. 2. Scheme of the frequencies of the eigenmodes of the empty cavity. In our model we assume that the separation between
longitudinal modes is much larger than the separation between transverse modes with the same longitudinal index. In addition,
only a few transverse modes lie under the atomic line.

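At this point we consider some limiting assumptions that allow for a considerable simplification of the
analytical description:

$$\omega_{n01} - \omega_{n00} \ll \frac{2\pi c}{\mathcal{L}}, \tag{7}$$

$$\gamma_\perp \ll \frac{2\pi c}{\mathcal{L}}, \tag{8}$$

$$\alpha L_A \ll 1, \tag{9}$$

$$T \ll 1, \tag{10}$$

$$2C = \alpha L_A / T \quad \text{finite}, \tag{11}$$

$$\theta = \frac{\omega_{n00} - \omega_{in}}{cT/\mathcal{L}} \quad \text{finite}. \tag{12}$$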

The physical meaning of eqs. (7) and (8) is that only the modes belonging to the longitudinal family
nearest to the input field can be excited, whereas all the other longitudinal modes do not play any role
in the dynamics (see fig. 2). Eq. (9) means that the interaction between electric field and atoms is
arbitrarily small and gives foundation to our initial assumption that the source term in the Maxwell
equation can be treated perturbatively. Eq. (10) describes an almost perfect reflection from the
spherical mirrors so that the field remains confined in the cavity long enough to experience significant
nonlinearities (the cooperative parameter 2C does not vanish). Conditions (7)-(12) correspond to a
generalisation of the well-known mean-field limit [13] and allow one to neglect the dependence of the
electric field in the atomic sample on the longitudinal coordinate $z$.

Taking into account the orthonormality property (5), it is possible to expand the slowly varying
envelope of the electric field $F(\rho, \varphi, t)$ in terms of the GL functions (4), obtaining:

$$F(\rho, \varphi, t) = \sum_{pl} f_{pl}(t) A_{pl}(\rho, \varphi), \tag{13}$$

where the modal amplitudes $f_{pl}$ are in general complex functions. We assume that the input field is
matched with the TEM$_{00}$ mode of the cavity so that its slowly varying envelope can be written as

$$F_{in}(\rho, \varphi) = Y A_{00}(\rho, \varphi), \tag{14}$$

where $Y$ is the normalised amplitude, which is taken real with an appropriate choice of the reference
frequency. Including these expressions in the Maxwell equation for the electric field interacting with the
atomic medium, making use of the approximations we have introduced and projecting onto the modal
eigenfunctions, we obtain a set of integro-differential equations which describe the temporal evolution
of the modal amplitudes [14]:

$$\frac{df_{pl}}{dt} = -k \left[(1 + i\theta + ia_{pl}) f_{pl} - Y \delta_{p,0} \delta_{l,0} + 2C \int_0^{2\pi} d\varphi \int_0^\infty d\rho \, \rho \, A^*_{pl}(\rho, \varphi) P(\rho, \varphi, t)\right], \tag{15}$$

where $k = cT/\mathcal{L}$ is the cavity linewidth and $a_{pl}$ denotes the difference between the frequency of the
mode of indices $p$, $l$ and that of the fundamental TEM$_{00}$ mode, normalised to $k$: $a_{pl} = (\omega_{npl} - \omega_{n00})/k$. $P$
is the normalised slowly varying envelope of the atomic polarisation. Eq. (15) must be coupled with the
atomic Bloch equations, which read

$$\frac{\partial P}{\partial t} = \gamma_\perp \left[F(\rho, \varphi, t) D(\rho, \varphi, t) - (1 + i\Delta) P(\rho, \varphi, t)\right], \tag{16}$$

$$\frac{\partial D}{\partial t} = -\gamma_\parallel \left[\mathrm{Re}\left(F^*(\rho, \varphi, t) P(\rho, \varphi, t)\right) + D(\rho, \varphi, t) - 1\right], \tag{17}$$

where $D$ is the normalised population inversion, $\gamma_\parallel$ is its relaxation rate and $\Delta$ is the detuning of the
atomic transition from the frequency of the input field, normalised to $\gamma_\perp$: $\Delta = (\omega_a - \omega_{in})/\gamma_\perp$.

In expansion (13) one should consider a priori all the modal amplitudes $f_{pl}$ with $p = 0, 1, \ldots$ and
$l = 0, \pm 1, \ldots$. However, in real devices, transverse modes of high order (large values of $q$) usually
suffer  from  high  diffraction  losses  due  to  the  finite  size  of  the  mirrors,  the  limited  diameter  of  the  active 
medium  and  the  presence  of  intracavity  elements  such  as  pinholes,  modulators,  etc.  Our  model  can 
describe  the  mode  selection  operated  by  such  devices  simply  by  limiting  the  expansion  (13)  to  those 
values  of  p  and  /  which  do  not  exceed  an  upper  limiting  value  of  the  family  index  q.  This  is  a  particular 
feature  of  optical  systems  that  allows  one  to  easily  control  the  number  of  modes  in  play,  both  in  the 
experimental  devices  and  in  the  dynamical  equations.  The  advantage  with  respect  to  other  physical 
systems  such  as  fluids,  where  the  number  of  modes  is  always  extremely  large,  is  evident. 


2.2.  Numerical  integration  of  the  dynamical  equations 

Before proceeding to the study of the stationary solutions of the dynamical equations (15)-(17), we
describe briefly how we perform their numerical integration. We first of all discretise the spatial
dependence of $P$ and $D$ by introducing a grid of points $(\rho_m, \varphi_n)$ [$m = 1 \ldots M$; $n = 1 \ldots N$] in the
transverse plane. Points $\varphi_n$ are equally spaced in the interval $[0, 2\pi]$, while points $\rho_m$ are chosen in such
a way that $2\rho_m^2$ [$m = 1 \ldots M$] are the zeroes of the Laguerre polynomial of order $M$. In this way the
double integral of eq. (15) can be approximated by a double-weighted sum over the indices $n$ and $m$
(Gaussian quadratures), while eqs. (16) and (17) transform into a set of $2M \cdot N$ ordinary differential
equations for $P_{mn}(t) = P(\rho_m, \varphi_n, t)$ and $D_{mn}(t) = D(\rho_m, \varphi_n, t)$:


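$$\frac{df_{pli}}{dt} = -k \left[(1 + i\theta + ia_{pl}) f_{pli} - Y \delta_{p,0} \delta_{l,0} + 2C \sum_{m=1}^{M} \sum_{n=1}^{N} w_{mn} A_{pli}(\rho_m, \varphi_n) P_{mn}\right], \tag{18}$$

$$\frac{dP_{mn}}{dt} = \gamma_\perp \left[F_{mn} D_{mn} - (1 + i\Delta) P_{mn}\right], \tag{19}$$

$$\frac{dD_{mn}}{dt} = -\gamma_\parallel \left[\mathrm{Re}(F^*_{mn} P_{mn}) + D_{mn} - 1\right], \tag{20}$$

where

$$F_{mn}(t) = \sum_{pli} A_{pli}(\rho_m, \varphi_n) f_{pli}(t) \tag{21}$$

and $w_{mn}$ is a set of weights for Gaussian quadratures.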
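
To make the quadrature grid more tangible, the following sketch checks the orthonormality of a few Gauss-Laguerre functions on such a grid. It is a minimal illustration, not the authors' integration code: it assumes the standard normalised form of the GL functions, uses NumPy's `laggauss` for the radial nodes $x_m = 2\rho_m^2$ and their weights, and the names `gen_laguerre`, `gl_mode` and `overlap` as well as the grid sizes are choices made here.

```python
import math
import numpy as np
from numpy.polynomial.laguerre import laggauss

def gen_laguerre(p, a, x):
    """Generalised Laguerre polynomial L_p^a(x) from its explicit sum."""
    return sum((-1) ** k * math.comb(p + a, p - k) * x ** k / math.factorial(k)
               for k in range(p + 1))

def gl_mode(p, l, rho, phi):
    """Gauss-Laguerre function A_pl(rho, phi) in its standard normalised form."""
    a = abs(l)
    norm = math.sqrt(2 / math.pi) * math.sqrt(math.factorial(p) / math.factorial(p + a))
    x = 2 * rho ** 2
    return norm * x ** (a / 2) * gen_laguerre(p, a, x) * np.exp(-rho ** 2) * np.exp(1j * l * phi)

def overlap(mode_a, mode_b, M=16, N=32):
    """Quadrature estimate of the overlap integral of two modes."""
    x, w = laggauss(M)                 # nodes x_m = 2 rho_m^2 and Gauss-Laguerre weights
    rho = np.sqrt(x / 2)
    phi = 2 * np.pi * np.arange(N) / N  # equally spaced angles
    R, P = np.meshgrid(rho, phi, indexing="ij")
    integrand = np.conj(gl_mode(*mode_a, R, P)) * gl_mode(*mode_b, R, P)
    # rho d(rho) = dx / 4; the Gauss weights already contain exp(-x), so it is divided out.
    radial = w * np.exp(x) / 4
    return (2 * np.pi / N) * np.sum(radial[:, None] * integrand)

modes = [(0, 0), (0, 1), (0, -1), (1, 0), (0, 2), (0, -2)]
gram = np.array([[overlap(a, b) for b in modes] for a in modes])
```

With $M$ radial nodes the radial sums are exact for polynomial factors up to degree $2M - 1$, which is why a modest grid already reproduces the identity matrix to rounding accuracy for low-order modes.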

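In eqs. (18)-(21) the doughnut Gauss-Laguerre functions $A_{pl}(\rho, \varphi)$, which depend on the angular
variable $\varphi$ via the exponential factor $e^{il\varphi}$ (see eq. (4)), have been replaced by their linear combinations
$A_{pli}(\rho, \varphi)$, defined in the following way:

$$A_{p01}(\rho, \varphi) = A_{p0}(\rho, \varphi), \tag{22}$$

$$A_{pl1}(\rho, \varphi) = \frac{1}{\sqrt{2}} \left[A_{pl}(\rho, \varphi) + A_{p\,-l}(\rho, \varphi)\right]
= \frac{2}{\sqrt{\pi}} \left[\frac{p!}{(p + l)!}\right]^{1/2} (2\rho^2)^{l/2} L_p^{l}(2\rho^2) \, e^{-\rho^2} \cos(l\varphi), \tag{23}$$

$$A_{pl2}(\rho, \varphi) = \frac{1}{\sqrt{2}\, i} \left[A_{pl}(\rho, \varphi) - A_{p\,-l}(\rho, \varphi)\right]
= \frac{2}{\sqrt{\pi}} \left[\frac{p!}{(p + l)!}\right]^{1/2} (2\rho^2)^{l/2} L_p^{l}(2\rho^2) \, e^{-\rho^2} \sin(l\varphi). \tag{24}$$

Accordingly, the angular index $l$ takes only non-negative values $l = 0, 1, 2, \ldots$. Since the $A_{pli}(\rho, \varphi)$ are
real functions, this substitution allows for a straightforward separation of the real and imaginary parts in
eq. (18), thus reducing the CPU time required by the adaptive Runge-Kutta integration method. In
sections 4 and 5 we shall refer to this latter modal decomposition.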

2.3.  Stationary  solutions 

The  stationary  equations  are  obtained  by  setting  the  time  derivatives  in  eqs.  (15)-(17)  equal  to  zero. 
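The atomic variables' stationary states are given by

$$P^{(st)}(\rho, \varphi) = \frac{(1 - i\Delta) F^{(st)}}{1 + \Delta^2 + |F^{(st)}|^2}, \qquad
D^{(st)}(\rho, \varphi) = \frac{1 + \Delta^2}{1 + \Delta^2 + |F^{(st)}|^2}, \tag{25}$$

where

$$F^{(st)}(\rho, \varphi) = \sum_{pl} A_{pl}(\rho, \varphi) f^{(st)}_{pl}. \tag{26}$$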

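By substituting the expression for $P^{(st)}$ in the stationary equations for the modal amplitudes of the field,
one obtains

$$Y = f^{(st)}_{00} (1 + i\theta) + 2C(1 - i\Delta) \sum_{p'l'} \int_0^{2\pi} d\varphi \int_0^\infty d\rho \, \rho \, \frac{A^*_{00} A_{p'l'}}{1 + \Delta^2 + \left|\sum_{p''l''} A_{p''l''} f^{(st)}_{p''l''}\right|^2} \, f^{(st)}_{p'l'}, \tag{27}$$

$$0 = f^{(st)}_{pl} (1 + ia_{pl} + i\theta) + 2C(1 - i\Delta) \sum_{p'l'} \int_0^{2\pi} d\varphi \int_0^\infty d\rho \, \rho \, \frac{A^*_{pl} A_{p'l'}}{1 + \Delta^2 + \left|\sum_{p''l''} A_{p''l''} f^{(st)}_{p''l''}\right|^2} \, f^{(st)}_{p'l'}, \tag{28}$$

where we have isolated the equation for the TEM$_{00}$ mode (the only one that experiences the driving
field $Y$).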

2.3.1. Singlemode TEM$_{00}$ solution

The singlemode solution is such that in eq. (26) only the amplitude $f_{00}$ of the TEM$_{00}$ mode is different
from zero, and is an exact solution of eqs. (27) and (28) only in the limit in which it is possible to
neglect all the modes belonging to families with $q \ge 2$.

In this limit the dynamic behaviour of the system is governed by the three modes (0, 0) and (0, ±1)
of the families $q = 0$ and $q = 1$, which are described by the modal functions

$$A_1(\rho) = A_{00}(\rho) = \sqrt{\frac{2}{\pi}} \, e^{-\rho^2}, \tag{29}$$

$$A_2(\rho, \varphi) = A_{01}(\rho, \varphi) = \frac{2}{\sqrt{\pi}} \, \rho \, e^{-\rho^2} e^{i\varphi}, \tag{30}$$

$$A_3(\rho, \varphi) = A_{0\,-1}(\rho, \varphi) = \frac{2}{\sqrt{\pi}} \, \rho \, e^{-\rho^2} e^{-i\varphi}, \tag{31}$$

and the electric field can be written as

$$F(\rho, \varphi, t) = \sum_{i=1}^{3} f_i(t) A_i(\rho, \varphi). \tag{32}$$

It is easy to show that the choice

$$f^{(st)}_1 = f^{(st)}_{00}, \qquad f^{(st)}_2 = f^{(st)}_{01} = 0, \qquad f^{(st)}_3 = f^{(st)}_{0\,-1} = 0$$

satisfies eqs. (27) and (28). In particular, eq. (27) becomes

$$Y = f^{(st)}_1 (1 + i\theta) + 2C(1 - i\Delta) \, 2\pi \int_0^\infty d\rho \, \rho \, \frac{|A_1(\rho)|^2}{1 + \Delta^2 + |A_1(\rho) f^{(st)}_1|^2} \, f^{(st)}_1. \tag{33}$$

By multiplying each side of eq. (33) by its complex conjugate one obtains an expression that links the
modulus of the output field $x_1 = \sqrt{2/\pi}\,|f^{(st)}_1|$ to that of the input field $y_1 = \sqrt{2/\pi}\,Y$ [15]:

$$y_1 = x_1 \left\{ \left[1 + \frac{2C}{x_1^2} \ln\frac{1 + \Delta^2 + x_1^2}{1 + \Delta^2}\right]^2 + \left[\theta - \frac{2C\Delta}{x_1^2} \ln\frac{1 + \Delta^2 + x_1^2}{1 + \Delta^2}\right]^2 \right\}^{1/2}. \tag{34}$$

The corresponding steady state curve, for suitable sets of parameters, displays the typical S-shape of
optical bistable systems.
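
The input-output relation for a Gaussian beam can be explored numerically. The sketch below assumes the standard closed form that follows from integrating eq. (33) with a Gaussian TEM$_{00}$ profile (substituting $u = e^{-2\rho^2}$ turns the radial integral into a logarithm) and checks that closed form against a direct midpoint-rule evaluation of the integral; the function names and parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

def log_term(x1, delta):
    """(1/x1^2) ln(1 + x1^2 / (1 + delta^2)): closed form of the radial integral."""
    return np.log1p(x1 ** 2 / (1 + delta ** 2)) / x1 ** 2

def radial_integral_numeric(x1, delta, rho_max=6.0, n=60000):
    """Midpoint-rule value of 2*pi * int rho |A_1|^2 / (1 + delta^2 + x1^2 e^{-2 rho^2}) d rho."""
    h = rho_max / n
    rho = (np.arange(n) + 0.5) * h
    gauss = np.exp(-2 * rho ** 2)
    integrand = 4 * rho * gauss / (1 + delta ** 2 + x1 ** 2 * gauss)
    return np.sum(integrand) * h

def input_amplitude(x1, C, delta, theta):
    """Input field modulus y1 as a function of the output field modulus x1."""
    g = log_term(x1, delta)
    return x1 * np.sqrt((1 + 2 * C * g) ** 2 + (theta - 2 * C * delta * g) ** 2)

x = np.linspace(0.1, 30.0, 3000)
y_weak = input_amplitude(x, C=1.0, delta=0.0, theta=0.0)     # single-valued response
y_strong = input_amplitude(x, C=20.0, delta=0.0, theta=0.0)  # S-shaped response
```

A curve $y_1(x_1)$ that is not monotonic corresponds to an S-shaped curve $x_1(y_1)$; for the resonant case used here this happens only when the cooperativity is large enough.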
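2.3.2. Multimode $l = 0$ solutions

When the modes belonging to families with $q \ge 2$ are included in the model the singlemode stationary
solution described in the previous section does not exist any longer, for the integrals that appear in eqs.
(27) and (28) couple mode (0, 0) with all the modes $(p, l = 0)$.

Let us consider for example what happens if we include in the model the modes of the family $q = 2$.
They are described by the modal functions

$$A_4(\rho) = A_{10}(\rho) = \sqrt{\frac{2}{\pi}} \, (1 - 2\rho^2) \, e^{-\rho^2}, \tag{35}$$

$$A_5(\rho, \varphi) = A_{02}(\rho, \varphi) = \frac{2}{\sqrt{\pi}} \, \rho^2 e^{-\rho^2} e^{2i\varphi}, \tag{36}$$

$$A_6(\rho, \varphi) = A_{0\,-2}(\rho, \varphi) = \frac{2}{\sqrt{\pi}} \, \rho^2 e^{-\rho^2} e^{-2i\varphi}, \tag{37}$$

that must be added to those of eqs. (29)-(31), so that the total electric field is

$$F(\rho, \varphi, t) = \sum_{i=1}^{6} f_i(t) A_i(\rho, \varphi). \tag{38}$$

The multimode $l = 0$ solution is in this case the solution such that only the modal amplitudes $f^{(st)}_1 = f^{(st)}_{00}$
and $f^{(st)}_4 = f^{(st)}_{10}$ do not vanish and their values are given by the following equations:

$$Y = f^{(st)}_1 (1 + i\theta) + 2C(1 - i\Delta) \, 2\pi \int_0^\infty d\rho \, \rho \, \frac{|A_1(\rho)|^2 f^{(st)}_1 + A_1(\rho) A_4(\rho) f^{(st)}_4}{1 + \Delta^2 + |A_1(\rho) f^{(st)}_1 + A_4(\rho) f^{(st)}_4|^2}, \tag{39}$$

$$0 = f^{(st)}_4 (1 + ia_{10} + i\theta) + 2C(1 - i\Delta) \, 2\pi \int_0^\infty d\rho \, \rho \, \frac{A_4(\rho) A_1(\rho) f^{(st)}_1 + |A_4(\rho)|^2 f^{(st)}_4}{1 + \Delta^2 + |A_1(\rho) f^{(st)}_1 + A_4(\rho) f^{(st)}_4|^2}. \tag{40}$$

Since all the modes with $l = 0$ are independent of the angular coordinate $\varphi$, the intensity profiles of
this kind of solution possess cylindrical symmetry and they are not, in general, very different from
that of the TEM$_{00}$ solution, for the contributions of the modes with $p \ge 1$ are usually small.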

2.3.3.  Generic  multimode  solutions 

Other multimode solutions exist in which modes with $l \ne 0$ are present. For these solutions the r.h.s. of
eqs. (27) and (28) are very complicated implicit functions of the $f^{(st)}_{pl}$ and, in general, also their
numerical solution is very difficult, so that it is easier to integrate numerically eqs. (15)-(17) and
analyse the steady state they reach asymptotically.

The intensity profile of one of these solutions is shown in fig. 3. In this case the parameters are
$C = 100$, $\Delta = 10$, $\theta = -2$, $a_{01} = 4$, $k/\gamma_\perp = 1$, $\gamma_\parallel/\gamma_\perp = 1$ and the modes belonging to families $q = 0, 1, 2, 3, 4$
are active. As one can see, the presence of modes with angular index $l$ different from zero causes a
breaking of the cylindrical symmetry in the intensity profile.

Even though the stationary configurations we have described so far can be represented by linear
combinations of the empty cavity eigenmodes, it must be noted that, in general, it is not true that any
linear combination of such modes is a stable solution of the dynamical equations. Because of the
nonlinear interaction among the atoms and the electric field, which is manifest in eqs. (27) and (28), the
system selects only particular linear combinations of the modes, which represent fixed points in the phase
space whose coordinates are the real and the imaginary parts of the modal amplitudes $f_{pl}$. The same
picture in terms of attractors holds for the dynamical regimes that we describe in the following section,
where the modal amplitudes are functions of time.

2.4. Destabilisation of the TEM$_{00}$ single mode steady state

In this section we restrict our analysis again to the three-mode model and describe how the onset of
the modes of family $q = 1$ can destabilise the singlemode TEM$_{00}$ stationary solution (34), giving rise to
more complex spatiotemporal structures.

In order to check the stability of the singlemode stationary solution we consider a small perturbation

$$\delta F(\rho, \varphi, t) = \sum_{i=1}^{3} A_i(\rho, \varphi) \, \delta f_i(t) = F(\rho, \varphi, t) - F^{(st)}(\rho, \varphi) = F(\rho, \varphi, t) - A_1(\rho) f^{(st)}_1, \tag{41}$$

$$\delta P(\rho, \varphi, t) = P(\rho, \varphi, t) - P^{(st)}(\rho, \varphi), \tag{42}$$

$$\delta D(\rho, \varphi, t) = D(\rho, \varphi, t) - D^{(st)}(\rho, \varphi) \tag{43}$$

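and linearise the dynamical eqs. (15)-(17) around the steady state. Then we introduce the usual
exponential Ansatz

$$\begin{pmatrix} \delta f_i(t) \\ \delta f_i^*(t) \\ \delta P(\rho, \varphi, t) \\ \delta P^*(\rho, \varphi, t) \\ \delta D(\rho, \varphi, t) \end{pmatrix}
= e^{\lambda t} \begin{pmatrix} \delta f_i^0 \\ \delta f_i^{0*} \\ \delta P^0(\rho, \varphi) \\ \delta P^{0*}(\rho, \varphi) \\ \delta D^0(\rho, \varphi) \end{pmatrix}, \tag{44}$$

obtaining a closed set of linear equations for the variables $\delta f_2^0$, $\delta f_2^{0*}$, $\delta f_3^0$, $\delta f_3^{0*}$.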

The condition for the existence of a non-vanishing solution of these equations leads to a transcendental
secular equation in $\lambda$:


X/k  =  -1  -  C(0(x,,  y.  A,  A)  -I-  <P(x,,y,  A,  -d)) 

±  {4CV(x,,  r,  A,  y.  A,  -A)x* 

-  [e  +  Oot  +  C(0(x,,  y.  A,  A)  -  d»(x,,  y.  A, 

where 

k  =  kly^,  5'  =  y||/yx. 


X 

^{x^ ,  y,  A,  d)  =  I  dp 


4p^e-^ny(A+2)(l-id)] 


(l-t-d^-t-x^e’"'’  ){(y-^A)((l-l-Ar-hd^]+y(l-HA)x;e-^''  } 


(45) 

(46) 

(47) 

(48) 

The existence of a root of eq. (45) with Re $\lambda > 0$ indicates the instability of the (0, 0) solution. Eq.
(45) can be solved, in general, only numerically. Here we consider two particular cases in which the
treatment can be considerably simplified and the instability boundaries in the parameter space can be
easily calculated.

A first simplified approach to the stability analysis consists in observing that, as eq. (45) clearly
shows, the eigenvalue $\lambda$ is proportional to $\tilde{k} = k/\gamma_\perp$. Hence, if $\tilde{k}$ is small, one can solve the
eigenvalue problem perturbatively by assuming that

$$\lambda = \lambda_0 + \tilde{k} \lambda_1 + O(\tilde{k}^2). \tag{49}$$

We note that the self-consistency of this method requires that the frequency spacing between mode
(0, 0) and the input field is of the order of $k$, while the frequency spacing between all other modes and
the input field is of the order of $\gamma_\perp$ [12]. In other terms, we require that

$$\theta = O(1), \qquad \eta = \tilde{k} a_{01} = \frac{\omega_{n01} - \omega_{n00}}{\gamma_\perp} = O(1). \tag{50}$$

The calculation of the first two terms of the expansion (49) yields

$$\lambda_0 = \pm i\eta, \tag{51}$$

$$\lambda_1 = -2C\Phi(x_1, \gamma, \lambda = \lambda_0, \Delta) - 1 \pm i\theta. \tag{52}$$

As $\lambda_0$ is purely imaginary, the instability condition amounts to

$$\mathrm{Re}\, \lambda_1 > 0. \tag{53}$$

Fig. 4. Instability domain of the singlemode TEM$_{00}$ solution
in the $(x_1, \eta)$ plane in the case $k/\gamma_\perp \ll 1$ for $C = 600$, $\Delta = 20$,
$\gamma = 1$. In correspondence to the solid line the real part of the
eigenvalue vanishes. Point A indicates the values of $x_1$ and $\eta$
that have been used in fig. 6.


Fig. 4 shows, in the parameter subspace $(x_1, \eta)$, the instability boundary Re $\lambda_1 = 0$, surrounding the
unstable region. The onset of this instability is marked by the change of sign in the real part of a
complex eigenvalue (Hopf instability).

Another kind of instability occurs when both Re $\lambda$ and Im $\lambda$ vanish simultaneously. The boundaries of
the instability domain are found by setting $\lambda = 0$ in eq. (45) and no assumption is made on the value of
$k$. It is worth noting that this procedure amounts to revealing the existence of Turing instabilities [16].
In fig. 5 the solid line encloses the instability domain in the parameter space $(x_1, a_{01})$; the other
parameters have been fixed at the values specified in the caption.


Fig. 5. Instability domains of the singlemode TEM$_{00}$ solution in the $(x_1, a_{01})$ plane. In correspondence to the solid line both the
real and the imaginary part of the eigenvalue vanish. The values of the other parameters are $\Delta = 10$, $\theta = -2$, $\gamma = 1$ and (a)
$C = 100$, (b) $C = 150$. The meaning of the lettering is explained in the text.
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3.  Dynamical  regimes 

In this section we describe the various regimes that have been observed in the analysis of the dynamical behaviour of our system. These regimes are rather different in the case $\kappa \ll \gamma_\parallel, \gamma_\perp$ (good cavity limit) and in the case $\kappa \sim \gamma_\perp \sim \gamma_\parallel$; therefore we treat the two cases separately.

3.1.  Good  cavity  limit 

This limit is defined by the condition $\kappa/\gamma_\perp \ll 1$. In this situation the instability condition for the singlemode TEM$_{00}$ solution is given by eq. (53). We have observed that the patterns that arise from this instability are always characterised by the fact that only the (0, 0) mode and one of the (0, ±1) doughnut modes contribute to the electric field. Their intensities are constant, but each of them oscillates with its own frequency. If we assume for definiteness that the doughnut mode with positive helicity is active, the electric field has the form

$$F(\rho, \varphi, \tau) = \sqrt{\frac{2}{\pi}} \left( f_1 + \rho\sqrt{2}\, f_2\, e^{i(\varphi - \eta\tau)} \right) e^{-\rho^2}, \qquad (54)$$

where $\tau = \gamma_\perp t$ and $\eta$ is the difference between the frequencies of the two families, normalised to $\gamma_\perp$. Eq. (54) yields the following expression for the field intensity:

$$|F(\rho, \varphi, \tau)|^2 = \frac{2}{\pi} \left[ |f_1|^2 + 2\rho^2 |f_2|^2 + 2\sqrt{2}\,\rho\, |f_1|\, |f_2| \cos(\varphi - \eta\tau - \varphi_0) \right] e^{-2\rho^2}, \qquad (55)$$

where $\varphi_0$ is the phase lag between $f_1$ and $f_2$ at time $\tau = 0$. Eq. (55) clearly describes a counterclockwise rotating pattern; the rotation is clockwise if the doughnut with negative helicity is active. The dynamics of the modulus of the field is depicted in the sequence of fig. 6, obtained with the following set of parameters: $C = 600$, $\Delta = 20$, $\gamma = 1$, $\eta = 1$, $x_s = 50$ (corresponding to point A in fig. 4), $\theta = 30$ and $\kappa = 0.01$.

As one expects from the inspection of fig. 6, there exists a point in the transverse plane where the electric field intensity vanishes. Such a point is a singularity for the electric field phase and, therefore, it can be regarded as an optical vortex, in the sense of ref. [3]. This point is easily identified in eq. (55) as the one having polar coordinates


$$\rho_v = \frac{|f_1|}{\sqrt{2}\,|f_2|}, \qquad \varphi_v = \eta\tau + \varphi_0 + \pi. \qquad (56)$$

Fig. 6. Evolution of the modulus of the field for a choice of the parameters corresponding to point A in fig. 4. The rotation of the whole structure and the presence of a vortex are clearly visible.


Obviously, the vortex moves on a circular orbit around the optical axis with angular frequency $\eta$. The existence of the same dynamical pattern has been theoretically predicted in active systems [17] and is presently being experimentally investigated by the group of Weiss and collaborators.
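The vortex position can be checked numerically. The following minimal sketch assumes the two-mode field has the form $F \propto (f_1 + \sqrt{2}\,\rho f_2 e^{i(\varphi - \eta\tau)}) e^{-\rho^2}$, with a $\sqrt{2/\pi}$ normalisation, and takes $\varphi_0$ as the phase of $f_1$ minus the phase of $f_2$; the function names and the amplitude values used in any call are hypothetical and serve only as an illustration.

```python
import cmath
import math


def field(rho, phi, tau, f1, f2, eta):
    """Two-mode field: (0, 0) mode plus a positive-helicity doughnut (assumed form)."""
    envelope = math.sqrt(2.0 / math.pi) * math.exp(-rho ** 2)
    doughnut = rho * math.sqrt(2.0) * f2 * cmath.exp(1j * (phi - eta * tau))
    return envelope * (f1 + doughnut)


def vortex_position(tau, f1, f2, eta):
    """Polar coordinates (rho_v, phi_v) of the zero of the field at time tau."""
    # Phase lag between the two complex amplitudes at tau = 0 (assumed convention).
    phi0 = cmath.phase(f1) - cmath.phase(f2)
    rho_v = abs(f1) / (math.sqrt(2.0) * abs(f2))
    phi_v = eta * tau + phi0 + math.pi
    return rho_v, phi_v
```

Evaluating `field` at the point returned by `vortex_position` gives zero to rounding error at every `tau`, and `phi_v` advances linearly in time at the rate `eta`, which is the circular orbit described above.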


3.2.  The  case  $\kappa \sim \gamma_\perp \sim \gamma_\parallel$

A rich variety of dynamical patterns has been found in this case. In fig. 5 the solid lines show the domains of instability predicted by the analysis of the three-mode model. We have used these results as a starting point for our study based on the integration of eqs. (15)-(17) up to family $q = 2$ (i.e. we considered the six-mode model). Our results are illustrated schematically in the same figure, where we indicate the multimode $l = 0$ stationary solutions with S1, and the generic multimode stationary solutions with S2. The symbol Pn, with n = 1, 2, 4, 8, refers to regimes where the total intensity is periodic with period n, whereas C1 and C2 denote two different chaotic regimes. As one can see, there exist extended areas of chaotic behaviour. Period-doubling routes have been identified for both regimes C1 and C2. Conversely, when the mode separation is increased the system tends to reach the S1 configuration, because all other higher-order families are disfavoured. The presence of unstable behaviour outside the instability domain and, vice versa, of stable regimes inside it is not surprising, for we are comparing the numerical results relative to the six-mode model with the theoretical predictions of the three-mode model, which is the only one that can be treated analytically.

Fig. 7 illustrates a period-4 behaviour which is observed for the parameter values of fig. 5b, with $a_{01} = 13.5$ and $x_s = 20$. Figs. 7a and 7b show the projections of the trajectory onto the planes $(\mathrm{Re}\, f_{00}, \mathrm{Im}\, f_{00})$ and $(\mathrm{Re}\, f_{01}, \mathrm{Im}\, f_{01})$, respectively.


Fig. 7. Example of periodic behaviour: (a) projection of the trajectory on the $(\mathrm{Re}\, f_{00}, \mathrm{Im}\, f_{00})$ plane; (b) projection of the trajectory on the $(\mathrm{Re}\, f_{01}, \mathrm{Im}\, f_{01})$ plane.

Fig. 8. Example of Silnikov-like chaotic behaviour: (a) projection of the trajectory on the $(\mathrm{Re}\, f_{00}, \mathrm{Im}\, f_{00})$ plane; (b) projection of the trajectory on the $(\mathrm{Re}\, f_{01}, \mathrm{Im}\, f_{01})$ plane; (c) temporal evolution of $\mathrm{Re}\, f_{01}$.

A second type of chaotic behaviour has been observed for the parameters of fig. 5b (with $x_s = 60$) and for several values of $a_{01}$ in the range $8.5 < a_{01} < 14$. In this case, the trajectory remains for a long time interval (extending for approximately 50 $\tau$-units) on a regular path. Correspondingly, $f_{00}$ drifts along the solid line of fig. 8a, while $f_{01}$ remains equal to zero, as shown in figs. 8b, 8c. Then, suddenly, the system experiences an abrupt push and subsequently spirals back to the regular orbit, a behaviour reminiscent of Silnikov chaos [18].

The chaotic behaviour indicated in fig. 5b with C1 is observed also for parameter values within the experimental reach. An extensive analysis of this regime is presented in the following section.

4.  Study  of  the  chaotic  states 

In this section we study extensively the behaviour of the system in the chaotic regime, with the goal of understanding the relation between spatial pattern formation and temporally irregular dynamics. To this end we concentrate on the case $\kappa = 1$, $\gamma = 1$, $C = 100$, $\Delta = 10$, $\theta = -2$, $a_{01} = 9$, $x_s = 30$ (cf. fig. 5a), which we study for an increasing number of active GL modes. Namely, we consider in the expansion (13) the families $q \le 2$, $q \le 3$ and $q \le 4$, respectively. In all three cases we performed the numerical integration of eqs. (18)-(20) and collected the time series of the expansion coefficients $f_{pli}(t)$ of eq. (21). In the analysis of extended systems, a complete characterisation of the dynamics may require the recording of huge amounts of data, an obstruction that, in many instances, can preclude the quantitative understanding of the system's behaviour. In our case the mode expansion offers the advantage of synthesising the space information, thus avoiding all problems related to the choice of a proper spatial sampling. Nevertheless, in order to perform analyses which have become standard in the theory of low-dimensional chaos, one has to record several hundred thousand data points per expansion coefficient, which, for increasing $q$, very soon reaches the capacity limit of currently available mass-storage devices. The analyses discussed in the following are based on three data sets: for $q \le 2$ we recorded, for each of the 6 complex expansion coefficients, $n = 1\,138\,000$ points, at a rate of one point each $\Delta t = 0.15$; for $q \le 3$, for each of the 10 complex coefficients, $n = 300\,000$ points with $\Delta t = 0.3$; for $q \le 4$, for each of the 15 complex coefficients, $n = 300\,000$ points with $\Delta t = 0.3$. The integration of the dynamical equations, carried out on an Alliant FX8 vector-parallel minicomputer, required several days of CPU time.
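The mode counts quoted above (6, 10 and 15 complex coefficients) can be reproduced with a short enumeration. The sketch below assumes the usual Gauss-Laguerre family index $q = 2p + l$, with one mode for $l = 0$ and two modes ($i = 1, 2$) for $l \ne 0$; the function names and the 8-byte storage size per real number are illustrative assumptions, not details taken from the paper.

```python
def gl_modes(q_max):
    """List the (p, l, i) indices of all Gauss-Laguerre modes with 2p + l <= q_max."""
    modes = []
    for q in range(q_max + 1):
        for p in range(q // 2 + 1):
            l = q - 2 * p
            if l == 0:
                modes.append((p, 0, None))   # no phase index for l = 0
            else:
                modes.append((p, l, 1))      # two angular dependences for l != 0
                modes.append((p, l, 2))
    return modes


def storage_bytes(n_modes, n_points, bytes_per_real=8):
    """Storage for n_points samples of n_modes complex coefficients (assumed 8-byte reals)."""
    return n_modes * 2 * n_points * bytes_per_real
```

With these assumptions, `len(gl_modes(2))`, `len(gl_modes(3))` and `len(gl_modes(4))` give 6, 10 and 15, and the $q \le 2$ data set alone occupies roughly 109 MB, which illustrates why storage becomes the limiting factor as $q$ grows.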


4.1.  Characterisation  of  the  chaotic  time  behaviour 

In this section we present the results of the analysis of the time series $f_{pli}(t)$, whose first step is the determination of the low-dimensional nature of the dynamics.

A first insight into the system's behaviour can be obtained by inspecting the projections obtained by plotting the $n$th attractor point versus the $(n+1)$st, for each of the time series. The existence of a relation between the shape of the attractor and the spatial configuration of the electric field is apparent. Independently of how many modes are active, the attractor's shape is mainly determined by the parity and magnitude of the angular index $l$, whereas the radial coefficient $p$ and the phase coefficient $i$ have little influence. Typical attractor shapes are seen in fig. 9, for $q \le 4$ and time interval between subsequent points $\Delta t = 0.6$. Real and imaginary parts are plotted separately. We notice that the attractors related to modes with odd $l$ have the shape of symmetric (with respect to both diagonals) loops. An increase in $|l|$ is accompanied by a higher density and a more complicated folding in the areas adjacent to the origin. If $l$ is even the attractors have no evident symmetry properties and become very irregularly dense for increasing $|l|$.

The relation between the spatial modal indices and the time behaviour of the amplitudes becomes even more evident when studying the probability distributions $P(f_{pli})$ of the values $f_{pli}(t)$ illustrated in fig. 10. The shape of the histograms is again rather independent of $q$, $p$ and $i$, whereas it depends strongly on the parity and the modulus of $l$. While the considerations we have made with reference to the symmetry properties of the attractors extend quite obviously to the probability distributions, it is here remarkably clearer how an increase in $|l|$ causes the histograms to be sharply peaked around the origin. Thus, modes with high angular coefficients are governed, for the given parameter values, by an intermittent dynamics. Further comments about the time fluctuations of modal amplitudes will be made in section 4.2.

As mentioned in section 2.2, the time series $f_{pli}(t)$ are actually the result of the numerical integration of a set of very many coupled nonlinear ordinary differential equations (in our case 972, 980 and 990 equations for $q \le 2$, $q \le 3$ and $q \le 4$, respectively). In order to investigate the low-dimensional nature of the irregular time dynamics we have proceeded to the measurement of fractal dimensions and entropies [20, 19], adopting the constant-mass method [21, 22].

Fig. 9. Two-dimensional projections of the Gauss-Laguerre coefficients $f_{pli}(t)$ for $q \le 4$, $\kappa = 1$, $\gamma = 1$, $C = 100$, $\Delta = 10$, $\theta = -2$, $a_{01} = 9$, $x_s = 30$. Each attractor point is projected against the preceding one. The coefficients are identified by the indices $p$, $l$ and, if $l \ne 0$, $i$; real and imaginary parts are plotted separately. Each figure contains 5000 points sampled at a rate of one point each $\Delta t = 0.6$.

The first set of results, illustrated in fig. 11, was obtained by considering the $f_{pli}(t)$ as coordinates in a phase space whose dimension is the number of the active GL modes (actually, twice this number, since real and imaginary parts are considered separately). The information dimension $D(1)$ [19] is plotted in fig. 11a for $q \le 2$, in fig. 11b for $q \le 3$ and in fig. 11c for $q \le 4$ versus the number of considered attractor points $n$ (given in powers of $b = 1.18$) for several orders $k$ of nearest neighbours (n-n). Fig. 11 should be understood in the following way. In each of the three cases, $m = 8192$ points were chosen on the attractor, at random with respect to the natural measure, as reference points. $D(1)$ is determined by measuring the exponential scaling of the average distance of the $k$th n-n with $n$, by means of a regression procedure carried out in an $n$-range of 12 powers of $b$ [22]. These plots allow one to study at the same time the convergence with both $n$ and $k$: in the cases (a) and (b) all n-n distances have reached




M. Brambilla et al. / Spatiotemporal patterns and chaos in passive optical systems


their asymptotic scaling for the available $n$-values [in (a) the typical fluctuations due to lacunarity have already set in], whereas in (c) only the lowest-order n-n distances have converged (as one expects because of the increased phase-space dimension). In all cases the convergence is satisfactory: giving a higher weight to lower-order n-n's one measures (a) $D(1) = 2.6$, (b) $D(1) = 2.3$, and (c) $D(1) = 2.5$, the errors being smaller than 0.2 in all three cases. The phenomenon is therefore unequivocally low-dimensional and $2 < D(1) < 3$. Relevant to us is the absence of any systematic increase in the information dimension with the number of active GL modes, which, therefore, cannot be easily related to the degrees of freedom governing the system's time dynamics.

Fig. 10. Probability distributions of the Gauss-Laguerre mode amplitudes $f_{pli}(t)$, for the cases of fig. 9. The histograms have been constructed by distributing 250 000 values over 512 bins.
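The nearest-neighbour scaling described above can be illustrated with a minimal sketch in the spirit of a constant-mass estimate. It is not the authors' code: it assumes the relation $\langle \ln r_k(n) \rangle \approx -\ln n / D(1) + \mathrm{const}$ for the distance $r_k$ from a reference point to its $k$th nearest neighbour among $n$ attractor points, uses nested subsets of a randomly ordered data set, and fits a straight line over sample sizes spaced in powers of $b$. The function names and default values are hypothetical.

```python
import numpy as np


def mean_log_knn_distance(sq_dist, k):
    """Average of ln(r_k) over reference points, given squared distances (refs x data)."""
    kth_sq = np.partition(sq_dist, k - 1, axis=1)[:, k - 1]
    return 0.5 * np.mean(np.log(kth_sq))


def fixed_mass_dimension(data, refs, k=5, b=1.18, n_powers=12):
    """Estimate D(1) from the scaling of the k-th neighbour distance with sample size n.

    data: (n_max, d) array of attractor points in random order.
    refs: (m, d) array of reference points drawn from the same measure.
    """
    data = np.asarray(data, dtype=float)
    refs = np.asarray(refs, dtype=float)
    # Squared distances between every reference point and every data point.
    sq = (refs ** 2).sum(axis=1)[:, None] + (data ** 2).sum(axis=1)[None, :]
    sq -= 2.0 * refs @ data.T
    sq = np.maximum(sq, 1e-300)          # guard against log(0) and rounding below zero
    n_max = data.shape[0]
    # Sample sizes n_max, n_max / b, n_max / b^2, ... (n_powers values).
    ns = np.unique(np.round(n_max / b ** np.arange(n_powers)).astype(int))
    mean_log_r = np.array([mean_log_knn_distance(sq[:, :n], k) for n in ns])
    slope, _ = np.polyfit(np.log(ns), mean_log_r, 1)
    return -1.0 / slope
```

For points scattered uniformly on a flat two-dimensional sheet embedded in three dimensions the estimate should come out close to 2, independently of the embedding dimension; repeating the fit for several values of `k` mimics the family of curves shown for different neighbour orders.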

The second set of results concerns measurements of the information dimension $D(1)$ and of the Kolmogorov metric entropy $K(1)$ [23] of the single time series $f_{pli}(t)$ and of the total electric field intensity $|F(t)|^2 = \sum_{p,l,i} |f_{pli}(t)|^2$ after time-delay reconstruction in a suitable $E$-dimensional embedding space. Our aim was twofold. On the one hand, we wanted to check that our numerical procedure for the integration of the dynamical equations preserved the correct coupling of all variables: this has been clearly proved, for all signals $f_{pli}(t)$ have been shown to have the same dimension and entropy. On the other hand, we were interested in testing the suitability of the total intensity $|F(t)|^2$ as a scalar observable for the investigation of the time dynamics, since this is by far the easiest quantity to measure in an experiment. Moreover, if many more modes are considered in future numerical integrations, it will be impossible, because of storage space, to record long sequences of all the $f_{pli}(t)$ for the measurements of the scaling properties of the time attractors. Therefore, we are interested in showing that these latter measurements can be performed on $|F(t)|^2$. The comparison between the values of figs. 11 and 12a provides evidence that this is indeed possible.
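As an illustration of the reconstruction step, the sketch below builds the total intensity from a set of complex modal amplitudes and embeds the resulting scalar series with time delays. It assumes that the total intensity is the sum of the squared moduli of the coefficients (orthonormal modes); the function names, the array layout and the choice of `lag` are hypothetical.

```python
import numpy as np


def total_intensity(coeffs):
    """Sum of squared moduli over modes; coeffs has shape (n_times, n_modes), complex."""
    return np.sum(np.abs(np.asarray(coeffs)) ** 2, axis=1)


def delay_embed(x, dim, lag=1):
    """Time-delay vectors (x[t], x[t + lag], ..., x[t + (dim - 1) * lag])."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("time series too short for this dim and lag")
    return np.column_stack([x[i * lag: i * lag + n_vectors] for i in range(dim)])
```

The rows of `delay_embed(total_intensity(coeffs), E)` can then be passed to any dimension or entropy estimator, and the estimate can be followed as the embedding dimension `E` is increased until it saturates.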

A very satisfactory convergence to a finite value is shown also by the measured metric entropy $K(1)$ (fig. 12b). Our result, namely $K(1) = 0.18 \pm 0$ should be compared with that for the Lorenz model, $K(1) = 1.1$, which indicates that a slower average exponential separation occurs in our case than on the Lorenz attractor.

Fig. 11. Information dimension $D(1)$ for (a) $q \le 2$, (b) $q \le 3$, (c) $q \le 4$ (other parameter values as in fig. 9) versus the number of considered attractor points $n$ [in (a) $n \le 284\,672$, in (b) and (c) $n \le 126\,947$]. The x-axis scale is given in powers of $b = 1.18$. Each curve represents the behaviour of different-order nearest-neighbour distances $k$, namely $k = 5$ (squares), $k = 10$ (circles), $k = 15$ (triangles), $k = 20$ (crosses), $k = 25$ (St. Andrew's crosses), $k = 30$ (diamonds), $k = 35$ (Y's).

Since the system's asymptotic motion takes place on an attractor of dimension $D(1) < 3$, it follows that some of the GL coefficients $f_{pli}(t)$ must necessarily show non-decaying time correlations. In the following we discuss the results of a systematic investigation of the linear correlation functions

$C_{pli,\,p'l'i'}(\tau) = \lim_{T \to \infty} \int_{-T}^{T} \ldots$  (57)

for  sets  of  modal  indices  coinciding  (autocorrelation)  or  distinct  (cross-correlation). 
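A finite-sample estimate of such a correlation function can be sketched as follows. Since the exact normalisation of eq. (57) is not reproduced here, the sketch assumes the common choice of subtracting the time averages and dividing by the product of the standard deviations, so that an autocorrelation equals 1 at zero lag; the function name and the use of non-negative lags only are further assumptions.

```python
import numpy as np


def cross_correlation(x, y, max_lag):
    """Normalised estimate of <x(t) y(t + lag)> for lag = 0, 1, ..., max_lag (in samples)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    norm = np.sqrt(np.mean(x ** 2) * np.mean(y ** 2))
    n = len(x)
    values = [np.mean(x[: n - lag] * y[lag:]) for lag in range(max_lag + 1)]
    return np.array(values) / norm
```

Calling `cross_correlation(x, x, max_lag)` gives the autocorrelation of a single series. For a signal oscillating with period 5 and sampled every 0.15 time units, the estimate returns close to 1 after about 33 samples and close to -1 after about 17, which is the kind of persistent oscillation discussed in the text.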

Let us consider first the autocorrelations: we have observed two typical time-decay modes, the first of which (fig. 13a) is displayed, independently of the values of the other indices, by the coefficients of GL modes having odd $l$, and the second (figs. 14a and 14b) by those having even $l$. The former is characterised by a very fast decay, followed by oscillations of very small amplitude and periods distributed around the average period of the GL mode, $\Delta t \approx 5$ (fig. 13a; cf. also the behaviour of the autocorrelations of the $x$ and $y$ variables of the Lorenz model). These oscillations generate the peaks in the power spectrum around $f = 0.03$ (fig. 13b). The latter decay mode is characterised by large oscillations, again having period $\Delta t \approx 5$ (fig. 14a; cf. also the behaviour of the autocorrelation of the $z$-variable of the Lorenz model). Their envelope presents first an exponential decay, with rate $\approx 250$, followed by persisting oscillations of amplitude $\approx 0.1$ and period $\Delta t \approx 500$ (fig. 14b). Correspondingly, many low-frequency peaks are visible in the power spectrum (fig. 14c).

Fig. 12. Information dimension $D(1)$ (a) and Kolmogorov metric entropy $K(1)$ (b) of the total electric field intensity $|F(t)|^2$ versus embedding dimension $E$, for $q \le 2$ (squares), $q \le 3$ (circles), $q \le 4$ (triangles) (other parameter values as in fig. 9). The measurements have been performed using $m = 8192$ reference points, $n = 284\,672$ data points and considering the scaling of the 5th n-n distances. $K(1)$ is in units of $\Delta t = 0.6$.
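The correspondence between the spectral peak at $f = 0.03$ and the oscillation period $\Delta t \approx 5$ follows from the sampling time: with frequencies measured in cycles per sample and a sampling time of 0.15, the period is $0.15 / f$. A minimal sketch of this conversion, using a synthetic sinusoid rather than the actual data (the function names are hypothetical):

```python
import numpy as np


def dominant_frequency(x):
    """Frequency (in cycles per sample) of the largest peak of the power spectrum."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0)   # 0 ... 0.5 cycles per sample
    return freqs[np.argmax(power)]


def period_from_frequency(f, dt):
    """Oscillation period in time units for a frequency given in cycles per sample."""
    return dt / f
```

With `dt = 0.15`, the highest resolvable frequency `f = 0.5` corresponds to a period of 0.3, while a peak near `f = 0.03` corresponds to a period of about 5.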


Fig. 13. (a) Autocorrelation function of the real part of Gauss-Laguerre coefficient $f_{01i}(t)$, for $i = 1$, $q \le 2$, computed over 131 072 data points. (b) Power spectrum of the same variable, computed over 1 048 072 data points. The x-axis scale is determined by the sampling time $\Delta t = 0.15$ (i.e. $f = 0.5$ corresponds to an oscillation of period $\Delta t = 0.3$).


Fig. 14. (a), (b): Autocorrelation function of the real part of Gauss-Laguerre coefficient $f_{02i}(t)$ for $i = 1$, $q \le 2$, computed over 131 072 data points. (c): Power spectrum of the same variable, scales as in fig. 13.


Fig. 15. Cross-correlation function of the real parts of
Gauss-Laguerre coefficients $f_{01i}(t)$ and $f_{02i}(t)$ for $i = 1$, $q \le 2$,
computed over 131 072 data points.

Whereas the cross-correlations of the couples $\mathrm{Re}(f_{pli})$-$\mathrm{Im}(f_{pli})$ decay, according to the value of $l$, in the same way we have described with reference to the autocorrelations, we observe that in general cross-correlations of generic coefficients present irregular oscillations (with periods distributed around $\Delta t \approx 5$) that do not decay in amplitude (fig. 15), with the exception of the case in which both modes have angular coefficients $l$ of the same parity. In this latter case they decay as in figs. 13a and 14a, 14b, for $l$ odd or even, respectively. The analysis of the linear correlation functions has confirmed, therefore, that systematic correlations among the $f_{pli}(t)$ are present also in the case of chaotic time evolution and that the nature of these correlations depends on the symmetry of the spatial patterns arising from the symmetry-breaking instabilities. In the following we proceed to the investigation of these relations by evaluating the degree of spatial coherence of the system.


4.2.  Spatiotemporal  effects 

The relations between space patterns and time behaviour evidenced in the preceding section suggest studying indicators that depend on both the dynamical regime and the transverse field configuration of the system. We therefore introduce the degree of spatial coherence $\Gamma$ of the (complex) variable $V(\mathbf{r}, t)$:

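$$\Gamma(\mathbf{r}) = \frac{\lim_{T\to\infty} (1/2T) \int_{-T}^{T} dt\, V(\mathbf{r}_0, t)\, V^*(\mathbf{r}, t)}{\sqrt{\lim_{T\to\infty} (1/2T) \int_{-T}^{T} dt\, |V(\mathbf{r}_0, t)|^2 \;\; \lim_{T\to\infty} (1/2T) \int_{-T}^{T} dt\, |V(\mathbf{r}, t)|^2}},$$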

where $\mathbf{r}_0$ is the position of a reference point (the origin, in our case) and $\mathbf{r}$ is the position vector in the transverse plane. Notice that a laser oscillating in a single transverse mode has perfect spatial coherence (i.e., $\Gamma = 1$). As variable $V$ we have chosen $F(\rho, \varphi, t) - \langle F(\rho, \varphi, t) \rangle$ ($\langle \cdot \rangle$ indicates time averaging) and $|F(\rho, \varphi, t)|^2 - \langle |F(\rho, \varphi, t)|^2 \rangle$; the latter quantity turned out to be the most convenient, also in view of the feasibility of a direct measurement of $\Gamma$ in a future experiment (e.g. averaging over photographic
frames taken at different times). In fig. 16 $\Gamma(x, y)$ is displayed for (a) $q \le 2$ and (b) $q \le 4$. In order to allow for an easier comparison, the $x$ and $y$ coordinates have been normalised to the mode maximum spatial extent $L$, defined, with some arbitrariness, as the value of the radial coordinate beyond which the modulus of any $A_{pli}$ does not exceed $10^{-10}$ ($L = 5.17$ for $q \le 2$ and $L = 5.52$ for $q \le 4$). In both cases $\Gamma$ has a very regular bell-shape, not very different from the one obtained for periodic motion in time.
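A discrete version of the degree of spatial coherence can be sketched for a stack of sampled frames, such as intensity snapshots on a grid. The sketch assumes the symmetric normalisation in which the time-averaged product of the fluctuations at the reference point and at a generic point is divided by the square root of the product of the two variances; the function name and array layout are hypothetical, and time averages over the available frames replace the infinite-time limit.

```python
import numpy as np


def spatial_coherence(frames, ref_index):
    """Degree of spatial coherence of the fluctuations of `frames` relative to one pixel.

    frames:    array of shape (n_times, ny, nx), real or complex.
    ref_index: (iy, ix) position of the reference point r0.
    """
    frames = np.asarray(frames)
    v = frames - frames.mean(axis=0)                 # V = field minus its time average
    v_ref = v[:, ref_index[0], ref_index[1]]
    numerator = np.mean(v_ref[:, None, None] * np.conj(v), axis=0)
    norm = np.sqrt(np.mean(np.abs(v_ref) ** 2) * np.mean(np.abs(v) ** 2, axis=0))
    return numerator / norm
```

For a pattern that factorises into a fixed positive spatial profile times a fluctuating amplitude, the result is 1 everywhere, in agreement with the remark on single-mode operation. For spatially uncorrelated noise it is 1 at the reference pixel and close to 0 elsewhere. Pixels whose fluctuations vanish identically would give a division by zero and need separate handling.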

As seen from fig. 16, the reduction in extent of the correlation region is of a few percent upon doubling of $q_{max}$. This is a further indication that the predominance of low-order modes and the presence of relevant correlations among mode coefficients prevent the system from losing spatial coherence even when many modes become active and the time dynamics chaotic. If $\Gamma(x, y)$ is computed, using the same parameter values, for a simulated evolution in which the $f_{pli}(t)$'s are substituted by Gaussian random numbers, its behaviour loses smoothness and the size of the central peak is reduced by a factor 2. In a similar simulation with $q \le 16$ the shape of $\Gamma(x, y)$ is already very irregular, and the (vaguely-defined) correlation length $\xi(\varphi)$ satisfies $\xi(\varphi) <$ 11 T, $\forall \varphi$. Because of the presence of the time correlations in the $f_{pli}(t)$'s discussed in the preceding section, the value of $q_{max}$ for which we expect the system to show a similar behaviour is much higher. Moreover, one should not forget that, hitherto, we have discussed just linear correlations, whereas higher-order effects can be relevant and further limit the ability of the system to lose spatial coherence. We plan to extend our analysis of mode amplitude correlation to full nonlinearity by considering the mutual information of the
relevant variables. If we consider the condition $\xi \ll L$ as necessary for the onset of spatiotemporal chaos [9], we do not expect, therefore, optical systems of our type to display this kind of behaviour until (i) very many GL modes are active and (ii) out of these, sufficiently many uncorrelated modes contribute to the dynamics with comparable amplitudes (i.e., spatiotemporal chaos could emerge only for selected parameter values among those yielding irregular time dynamics). We point out again that the absence of linear correlation may not be sufficient for the fulfillment of condition (ii). A very efficient way of testing condition (ii) is the study of the probability distributions $P(f_{pli})$ of the time fluctuations of modal amplitudes introduced in section 4.1, from which one can rapidly evaluate the contribution to the dynamics of relevant high-order modes. Moreover, if, instead of GL modes, one considers spatial Fourier modes (for instance by Fourier-transforming along the radial direction for a chosen phase), one expects the time amplitudes of the terms having wavelength much larger than the correlation length $\xi$ to be Gaussian-distributed [9]. Obviously, the distributions of the time fluctuations of the Fourier coefficients can be straightforwardly calculated from the knowledge of the $f_{pli}(t)$.

Fig. 16. Degree of spatial coherence $\Gamma(x, y)$ of the electric field intensity $|F(x, y, t)|^2 - \langle |F(x, y, t)|^2 \rangle$ for (a) $q \le 2$ and (b) $q \le 4$. The $x$ and $y$ coordinates are normalised to the mode maximum spatial extent $L$. $\Gamma$ has been evaluated on a square space grid of 100 × 100 points by averaging over 4096 time configurations, sampled at intervals $\Delta t = 0.6$.


5.  Concluding  remarks

The  integration  of  the  dynamical  equations  with  a  number  of  modes  sufficiently  great  for  irregular 
spatial  structures  to  arise  cannot  be  performed,  unfortunately,  by  using  the  method  illustrated  in  section 
2.2,  because  of  the  exceedingly  large  CPU-time  that  would  be  required.  The  theoretical  study  of  the 
system’s  behaviour  in  this  limit  must  therefore  be  pursued  with  the  help  of  new,  more  sophisticated 
numerical techniques.

Conversely, the experimental observation of the onset of spatiotemporal chaos in passive optical systems (if it occurs at all) seems to be, at least in principle, achievable, since a cavity (whose size is usually of the order of 1 cm) can support very many modes, and their number can be controlled by adjusting the intracavity aperture. The already mentioned difficulties arising from the huge amount of data that should be recorded in order to characterise an extended system deserve special consideration. An acquisition apparatus based on a space grid of photosensitive devices, for instance, would be extremely expensive, and would also have such a high information output that the data channels and the storage system would very soon be saturated. In our case, at least for a first set of experiments, one can focus on more global quantities. As discussed in section 4.1, dimensions and entropies can be suitably determined from the total output intensity, while the degree of spatial coherence $\Gamma$ can be measured by using series of photographic snapshots. In order to test the onset of spatiotemporal chaos, one can study the distributions of the long-wavelength Fourier mode coefficients calculated, e.g., by fast-Fourier transforming (in space) the intensity data sampled along the radial direction by a small number of photosensitive devices. Histograms of the fluctuations of the modal amplitudes can be obtained already after suitably short observations. The experimenter could vary the system parameters until these distributions approach a Gaussian.
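The proposed test can be sketched numerically. The example below assumes that the intensity is sampled at each time by a small number of equally spaced detectors along the radial direction, takes the spatial FFT of each snapshot, and measures how far the distribution of a long-wavelength coefficient is from a Gaussian through its excess kurtosis (zero for a Gaussian). The choice of kurtosis as indicator, the function names and the array layout are assumptions made for illustration.

```python
import numpy as np


def long_wavelength_coefficients(samples, mode=1):
    """Real part of the spatial Fourier coefficient of index `mode` for each snapshot.

    samples: real array of shape (n_times, n_detectors), one row per snapshot.
    """
    spectrum = np.fft.rfft(np.asarray(samples, dtype=float), axis=1)
    return spectrum[:, mode].real


def excess_kurtosis(values):
    """Fourth standardised moment minus 3; close to 0 for Gaussian-distributed data."""
    v = np.asarray(values, dtype=float)
    v = v - v.mean()
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2 - 3.0
```

A strongly intermittent signal, sharply peaked around zero with occasional bursts, gives a large positive excess kurtosis, whereas values near zero indicate that the distribution is approaching a Gaussian.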

In section 2 we have modeled the dynamical behaviour of an absorbing nonlinear optical medium in a ring cavity by means of a set of equations in which the slowly varying envelope of the electric field is described by frequency-degenerate Gauss-Laguerre modes. A procedure for their numerical integration has also been illustrated. In suitable conditions, these equations admit stationary solutions. We have shown that the steady-state configuration may correspond to a single or to many GL modes. In the former situation we have derived the analytic expression of the solution for some simple cases and performed a linear stability analysis. As the control parameters are varied, the system experiences different dynamical regimes, some of which have been illustrated in section 3. In general, the symmetry properties of the solutions differ from those of the empty cavity, and the time behaviour may become irregular. An extensive study of such a situation has been presented in section 4. By measuring dimension and entropy, we have proved that the asymptotic motion of the system takes place on a low-dimensional strange attractor. Evidence for the existence of systematic non-decaying linear time correlations among GL modal amplitudes has been provided, thus clarifying the relation between mode excitation and relevant active degrees of freedom. The measurement of the degree of spatial coherence $\Gamma$, presented in section 4.2, has further substantiated the role of this relation. Finally, we have discussed the conditions for the onset of spatiotemporal chaos, coming to the conclusion that this kind of behaviour cannot be displayed by the optical systems we have modeled until very many (i.e., hundreds of) GL modes are active. Moreover, sufficiently many of these modes must display decaying correlations and contribute to the dynamics with comparable amplitudes. The influence of higher-order correlations remains unexplored and may constitute a further obstruction to the onset of spatiotemporal chaos.


The general features of the onset of spatiotemporal chaos observed in several physical systems are reviewed with special emphasis on the statistical properties. The possibility of constructing a thermodynamics of spatiotemporal chaos is discussed by considering an experiment on thermal convection. It is shown that, by analogy with a system near thermal equilibrium, it is possible to define two quantities, which behave like an energy and an entropy and are related by standard thermodynamic relationships. The fluctuation-dissipation theorem is also verified.


1.  Introduction 

The transition to chaos in spatially extended systems is now widely investigated in many natural phenomena such as hydrodynamic [1-6] and optical instabilities [7,8], flames [9], chemical reactions [10], pattern formation in biology [11] and information storage in the brain [11]. There are also technical applications in which the analysis of the transition to spatiotemporal chaos is important in order to understand the stability of devices, for example, Josephson junction and laser diode arrays [12]. Theoretically, the general properties of this transition are studied in simple mathematical models such as coupled maps [13-15]*, partial differential equations [14, 16-18] and cellular automata [19-21]. The reason for this interest is to attempt to understand the complex space-time evolution of the above-mentioned systems, using concepts of either statistical mechanics or low-dimensional dynamical systems, which have been extensively studied in the last decade [22].

* See also the references in [18b].

1 To whom all correspondence should be addressed.

It is important to point out immediately the
differences between low dimensional chaos,
spatiotemporal chaos and fully developed turbulence.
In order to make such a distinction we
follow Hohenberg and Shraiman [23], who made
this point very clear by defining three characteristic
length scales: the dissipation length $l_D$,
the excitation length $l_E$ and the correlation
length $\xi$. The ratios of these lengths with respect
to each other and to the characteristic system
size L determine the state of the system. The
dissipation length $l_D$ is the minimum length below
which all the modes are damped. The excitation
length $l_E$ is the characteristic length at which
energy is injected into the system. For small
systems $L \simeq l_E$ and, for moderate values of the
control parameter, $l_E \simeq l_D$. Then only a few
modes are excited and the dynamics can be
explained by the interaction of a small number of
modes. In contrast, for a large system, that is,
when $L \gg l_E$, one needs to introduce the correlation
length $\xi$, which can be defined in terms of the
correlation function C(r, t). In many cases this
function decays exponentially like $\exp(-r/\xi)$,
thus defining $\xi$ in a natural way. When $\xi \sim L$ the
system is coherent in space but may evolve chaotically
or regularly, that is, it behaves like a
small dynamical system. In these regimes
dynamical systems may present interesting spatial
features such as mode competition, travelling
waves and localized oscillations, that do not
destroy the spatial order. When the correlation
length $\xi \ll L$ the dynamical behavior is incoherent
in space and the system may be chaotic in
both space and time. This regime, which occurs
for $L \gg l_E \simeq l_D$, is the regime of spatiotemporal
chaos or weak turbulence and usually corresponds
to the evolution of coherent structures of
the size of the correlation length. This regime
has to be distinguished from fully developed
turbulence which implies an energy cascade, that
is, Fourier spectra with power law decay [24, 25],
which does not necessarily appear in spatiotemporal
chaos.
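
As an illustration of how a correlation length can be extracted when the correlation function decays exponentially, the following minimal sketch fits the logarithm of the time-averaged spatial autocorrelation of a field sampled on a periodic one dimensional grid. The function name, the array layout u[t, x], the restriction to short lags and the 0.05 cutoff on the correlation are our assumptions for the example; they are not taken from the experiments discussed in this paper.

```python
import numpy as np


def correlation_length(u):
    """Estimate the correlation length of a field u[t, x] on a periodic grid.

    The result is expressed in units of the grid spacing. The estimate
    assumes that the spatial correlation decays like exp(-r / xi).
    """
    # Fluctuations about the time average at each spatial point.
    w = u - u.mean(axis=0)
    # Circular spatial autocorrelation of each snapshot (Wiener-Khinchin),
    # then averaged over all snapshots.
    power = np.abs(np.fft.fft(w, axis=1)) ** 2
    corr = np.fft.ifft(power, axis=1).real.mean(axis=0)
    corr = corr / corr[0]
    # Use short lags only, and keep points well above the noise floor
    # (the 0.05 threshold is an arbitrary choice for this sketch).
    r = np.arange(1, u.shape[1] // 4)
    keep = corr[r] > 0.05
    slope, _ = np.polyfit(r[keep], np.log(corr[r][keep]), 1)
    return -1.0 / slope
```

The fit needs at least two lags above the cutoff; for a field whose correlation does not decay exponentially the returned number has no simple meaning.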

One of the main goals in the research of
spatiotemporal chaos is to understand properties
that do not depend on the particular system
considered. For example, several experimental
[1,3,4] and numerical works [13a, 14] have
shown that the transition to spatiotemporal
chaos has several properties of a second order
phase transition, although the details of the transition
may depend on the specific system. For
example, in surface waves [1] (two spatial dimensions
plus time) it is reminiscent of a transition in
the orientational order. When the system can be
considered almost one dimensional, the transition
is dominated by spatiotemporal intermittency
[2, 3, 4, 13a, 14, 17], whose dynamics presents
a mixture of ordered and disordered regions that
are competing, as described in sections 2 and 3.
The space-time evolution in the above-mentioned
works was always characterised in terms
of local variables whose dynamics can be rather
complex. However, it is also important to see if
global variables of the system can be useful in
describing spatiotemporal chaos, or, in other
words, if one could expect to find simpler statistical
properties in those variables than in the local
ones and eventually to construct a statistical
mechanics of spatiotemporal chaos [13b, 23, 26].
Furthermore, the dynamics of global variables
may be related to some local dynamics as was
observed in a recent experiment [5] on the Faraday
instability (surface waves excited
parametrically). Specifically it has been shown
that the fluctuations of the driving acceleration,
which is a global parameter, are strongly correlated
with the dynamics of defects of the surface
wave pattern.

The first step toward the construction of a
statistical mechanics of spatiotemporal chaos is
the determination of the distributions of the local
fluctuations and those of the Fourier mode amplitudes.
This has been applied to the solutions
of the Kuramoto-Sivashinsky (KS) equation [27-29].
Specifically Pumir [27] has studied the fluctuations
in real and Fourier space for the solutions
of KS. Yakhot [28] and Zaleski [29] tried
to explain the dynamics of long wavelength
Fourier modes as driven by a stochastic noise
produced by the short wavelength Fourier
modes.

In addition, Kaneko [13b] has extensively investigated
the thermodynamic properties of
spatiotemporal chaos of coupled maps. He mainly
studied the scaling of the Lyapunov dimension
and Kolmogorov-Sinai entropy as functions of
the size of the subspace used to measure them. He
found that these two quantities scale linearly
with size, thus indicating that they can be considered
extensive thermodynamic variables. He also
suggested the use of these two quantities to
characterize space-time chaos in experimental
systems, but the well known difficulty of measuring
Lyapunov exponents in systems that have
dimension greater than 5 makes this test almost
impossible [30].

M. Caponeri, S. Ciliberto / Thermodynamic aspects of spatiotemporal chaos

The statistical properties of space-time chaos
have also been measured in several experiments
on surface waves [1a], Rayleigh-Benard convection
[26] and in an optical device [7]. In all of
these experiments it has been observed that the
amplitude of the long wavelength Fourier modes
has Gaussian statistics although the local
dynamics may have strong deviations from a
Gaussian. This rather strange result, that we will
discuss in section 4, is due to the fact that long
wavelength Fourier modes are coarse grained
variables of the system. Therefore, they imply an
average over many correlation lengths, and thus
the central limit theorem ensures that the amplitudes
of long wavelength Fourier modes have a
Gaussian distribution even though the local
dynamics does not [23, 31]. In the above-mentioned
experiments [26] on Rayleigh-Benard
convection an attempt has also been
made to construct a thermodynamics by introducing
some global quantities, related by simple
thermodynamic relationships, that are able to
characterize space-time chaos and are easily
measurable in experiments.

Another important problem that has to be
solved in the research of spatiotemporal chaos is
the measure of at least the order of magnitude of
the number of degrees of freedom involved in
the dynamics. Clearly the number of positive
Lyapunov exponents would be useful. However, as
we mentioned before, in experiments the calculation
of these exponents is impossible once the
number of positive ones is greater than 4. Thus
other methods have been tested. A very promising
one is the Karhunen-Loeve decomposition
that has been applied to experimental [32,33]
and numerical data sets [32,34,35]; it can be
very useful not only in estimating the number of
the main degrees of freedom involved in the
dynamics [32], but also in constructing models of
the system under study [33].
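
To make the idea of counting the main degrees of freedom concrete, here is a minimal sketch of a Karhunen-Loeve (proper orthogonal) decomposition computed with a singular value decomposition. It counts how many modes are needed to capture a chosen fraction of the fluctuation variance. The function name, the array layout u[t, x] and the 95% default are assumptions made for the example; the criterion actually used in refs. [32-35] may differ.

```python
import numpy as np


def kl_mode_count(u, fraction=0.95):
    """Count the Karhunen-Loeve modes holding `fraction` of the variance.

    u is a space-time array u[t, x]. Returns the number of modes and the
    normalised variance carried by each mode, in decreasing order.
    """
    # Remove the time average at each spatial point.
    w = u - u.mean(axis=0)
    # The squared singular values are the eigenvalues of the spatial
    # covariance matrix, up to a constant factor.
    singular_values = np.linalg.svd(w, compute_uv=False)
    energy = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(energy)
    # First index at which the cumulative variance reaches the target.
    n_modes = int(np.searchsorted(cumulative, fraction) + 1)
    return n_modes, energy
```

The count depends on the chosen fraction, so it should be read as an order of magnitude rather than as a sharp number.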

In this review, in order to describe in more
detail all of the above-mentioned aspects of the
research on space-time chaos, we consider a
specific experiment [3, 26, 32] in which many of
those aspects have been analysed. We will also
try to compare the results of this experiment
with those of other experiments and of numerical
simulations. The experiment considered here is
on thermal convection in a horizontal fluid layer
heated from below, that is Rayleigh-Benard
convection. This fluid instability can be viewed
as a general example because, as a function of the
control parameter and of boundary conditions, it
presents a periodic stationary pattern, low dimensional
chaos, spatiotemporal chaos and turbulence.

The paper is organized as follows: in section 2
we recall the main features of Rayleigh-Benard
convection and those of our experiment. We also
describe the different states observed as a function
of the control parameter. In section 3 we
analyse the transition to spatiotemporal chaos in
terms of local variables, showing that this transition
presents features of a second order phase
transition. In section 4 the possibility of constructing
a thermodynamics in the spatiotemporally
chaotic regimes is discussed, starting with
the definition of a temperature of the system. We
also show that if a thermodynamic formalism is
applied the transition to space-time chaos still
presents features of a second order phase transition
in the global variables. In section 5 the
application of the Karhunen-Loeve decomposition
in estimating the number of degrees of
freedom on experimental and numerical data
sets is described. It is shown that this method
seems to give a number close to the number of
positive Lyapunov exponents. Finally, in section
6 a general discussion and conclusions are presented.


2.  A  specific  system 

2.1.  Rayleigh-Benard  convection  in  annular 
geometry 

The features of Rayleigh-Benard convection
may be found in standard texts and review papers
[36], thus we recall here only the main ones.
Let us consider a fluid layer confined between
two horizontal solid plates and heated from
below. The most relevant parameter of this instability
is the Rayleigh number $Ra = \beta g \Delta T d^3 / (\nu \chi)$,
where $\beta$ is the volumetric expansion coefficient,
g the acceleration of gravity, $\nu$ the
kinematic viscosity, $\chi$ the thermal diffusion coefficient,
d the depth of the layer and $\Delta T$ the
difference of temperature between the two
horizontal plates. When Ra exceeds the
threshold value $R_c$ steady convection arises, producing
a pattern of parallel rolls with a well
defined wavenumber k. The rolls are parallel to
the shortest side of the cell containing the fluid.
The values of $R_c$ and k depend on the size of the
cell and on the boundary conditions. For example,
in the case of an infinite layer and perfectly
conducting plates $R_c = 1708$ and $k_c = 3.11/d$.
Thus, from an experimental point of view, it is
very important to define two other parameters,
namely the aspect ratios $\Gamma_x = L_x/d$ and $\Gamma_y = L_y/d$,
where $L_x$ and $L_y$ are the two horizontal
lengths of the cell. The time dependent regimes
of Rayleigh-Benard convection observed at
$Ra > R_c$ are strongly influenced by the aspect
ratios and also by the Prandtl number $Pr = \nu/\chi$.
Indeed, in the experiments in which the transition
to low dimensional chaos has been studied
[37], $\Gamma$ was of the order of $2\pi/k$. As we mentioned
in section 1, this corresponds to the correlation
length of the system being of the order of
the system size, i.e. $\xi \sim L$.

In the experiment that we describe in this
paper the cell containing the working fluid is
annular. Indeed, with this configuration and a
suitable choice of the radial aspect ratio, it is
possible to construct a pattern that is almost a
one dimensional chain of radial rolls (roll axis
along radial directions, see also fig. 1) with
periodic boundary conditions. These features of
the spatial pattern are very useful in order to
compare the results of our experiment with those
obtained by the mathematical models mentioned
in section 1. The inner and outer diameters of
the cell are 6 and 8 cm respectively, whereas the
depth d of the layer is 1 cm. Thus the azimuthal
aspect ratio, measured on the circle of
mean diameter, is 22, and the radial one is 1.
The fluid is silicone oil with a Prandtl number of
30 and the critical temperature difference, computed
for an infinite layer, is $\Delta T_c = 0.06$ °C. The
different states of the system will be labeled with
$\eta = \Delta T/\Delta T_c$. The experimental details have been
described elsewhere [38]. The set up provides
the possibility of measuring on the circle of mean
diameter the two components of the thermal
gradient averaged in the vertical direction, in the
polar coordinate reference frame $(r, \theta)$. Below we
describe the properties of only the component of
the gradient perpendicular to the roll axis, because
those of the other component (the radial
one) are the same. The azimuthal component
(made dimensionless by dividing it by $\Delta T_c/d$) is
called u(x, t), where $x = \theta/2\pi$ indicates the position
on the circumference of mean diameter,
thus $0 \le x \le 1$. The function u(x, t) is sampled at
N points in space. In time dependent regimes
u(x, t) is recorded at 5000 or more instants, in
order to have sufficient statistical accuracy. The
time series length was about 500 times the mean
period of oscillation of the system. N was equal
to 128 in several runs and 256 in others. However,
no significant dependence of the results on N
has been noticed.

Fig. 1. Shadowgraphs of typical spatial patterns. White and
dark regions correspond to cold and hot currents respectively.
(a) Stationary spatial pattern at $\eta = 100$; (b) Snapshot of
the spatial pattern at $\eta = 190$ in a time dependent biperiodic
regime; (c) Snapshot of the spatial patterns at $\eta = 230$ in a
spatiotemporal intermittent regime. Notice the simultaneous
presence of ordered and disordered regions in different parts
of the annulus.
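
The two aspect ratios quoted above follow directly from the cell dimensions. The short check below reproduces the arithmetic, using only the inner and outer diameters and the depth given in the text; the variable names are ours.

```python
import math

# Cell geometry as given in the text.
inner_diameter_cm = 6.0
outer_diameter_cm = 8.0
depth_cm = 1.0

# The azimuthal length is the circumference of the circle of mean diameter.
mean_diameter_cm = 0.5 * (inner_diameter_cm + outer_diameter_cm)
azimuthal_aspect_ratio = math.pi * mean_diameter_cm / depth_cm

# The radial length is the width of the annular gap.
radial_aspect_ratio = 0.5 * (outer_diameter_cm - inner_diameter_cm) / depth_cm

print(round(azimuthal_aspect_ratio, 2), radial_aspect_ratio)  # 21.99 1.0
```

The circumference of the mean circle is $7\pi \approx 21.99$ cm, which for a 1 cm deep layer gives the azimuthal aspect ratio of about 22.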

2.2.  Spatial  patterns 

Analysing the fluid behavior as a function of
$\eta = \Delta T/\Delta T_c$, we observe that for $\eta$ around 1
there are about 22 rolls. This number increases
with $\eta$ and reaches 38 at $\eta$ around 200. A
detailed analysis of the wavenumber selection
process has been reported elsewhere [38]. In fig.
1a we show the shadowgraph of the spatial pattern
at $\eta = 100$. Dark regions correspond to the
hot currents rising up and white regions to the
cold ones, going down. We note that the configuration
constrains the spatial structure to an almost
one dimensional chain of rolls.

The spatial structure remains stationary for
$\eta < 164$, where a subcritical bifurcation to the
time dependent regime takes place. For $\eta > 164$
the time evolution is chaotic but, reducing $\eta$, the
system presents either periodic or quasiperiodic
oscillations, and at $\eta = 149$ it is again stationary.
In the range $149 < \eta < 200$ the time dependence
consists of rather localized fluctuations that
slightly modulate the convective structure, which
maintains its periodicity. This is clearly seen in
fig. 1b where a snapshot of the spatial structure
at $\eta = 190$ is shown. The presence of hot and
cold currents transverse to the main set of rolls
merits a special comment. Such a two dimensional
effect certainly influences the dynamics. However,
considering that the ratio between the
length and the width of the annulus is roughly
22, we expect that the system can be considered
almost one dimensional as far as the
propagation time of thermal fluctuations along
the circle is concerned, because the two time scales are very
well separated. Besides, we also observe that the
time dependent fluid motion is still well correlated
along the radius.

The space-time evolution of u(x, t) and the
corresponding time evolution of the point x = 0
at $\eta = 164$ are shown in fig. 2a and fig. 2b. In fig.
2b we clearly see that the time evolution is
quasiperiodic. However, this time dependent
modulation is hardly seen in fig. 2a, because it
only slightly perturbs the spatial pattern, which
maintains its original periodic structure. Increasing $\eta$
($< 200$) the time evolution becomes chaotic but
there is still spatial order. The fractal dimension
and the orthogonal decomposition (see section 5)
indicate that the number of degrees of freedom
of the dynamics is around 3.

Fig. 2. (a) Space-time evolution of u(x, t) at $\eta = 164$; (b)
Corresponding time evolution of the point x = 0. The vertical
scale has been amplified in (b) because the time dependent
modulation slightly perturbs the spatial pattern shown in (a),
where the maximum amplitude is roughly 4 °C/cm.

At higher $\eta$ the spatial order begins to be
destroyed because of the appearance of bursts,
detaching from the boundary layer. This spatiotemporal
intermittency appears at $\eta > 200$. The
snapshot of a typical spatial pattern at $\eta = 230$ is
shown in fig. 1c. It presents several domains
where the spatial periodicity is lost (we will refer
to them as turbulent) and other regions (that we
call laminar) where the spatial coherence is still
maintained. The alternation between ordered
and disordered regions is clearly observed in the
space-time evolution of u(x, t) at $\eta = 216$,
shown in figs. 3a and 3b for two different time intervals. We
notice that for $1000 < t < 1040$ there are strong
oscillations that locally destroy the spatial order,
whereas for $1500 < t < 1540$ the pattern is again
very regular. In the following sections we will
describe in detail the statistical properties of this
spatiotemporal intermittency, but we want to
discuss first the spatial Fourier spectra in the
different regimes described in this section.

The time averaged spatial Fourier spectra at
$\eta = 164$, $\eta = 216$ and $\eta = 347$ are shown in figs. 4a,
4b, and 4c respectively.

The spatial spectrum of fig. 4a, corresponding
to a biperiodic time dependent regime, presents
well defined peaks, because the spatial structure,
although modulated in time, is still very ordered.
In contrast fig. 4b, corresponding to a value of $\eta$
that is very close to the threshold for spatiotemporal
intermittency, presents a broadened third
harmonic. This indicates that the most important
length scales for this transition are the shortest
ones. Finally, in fig. 4c the spectrum, corresponding
to a value of $\eta$ far above the transition
point, is totally broadened because the spatial
order has been destroyed. Notice the exponential
decay at high k and the flat region at
small k. These features are rather similar to
those observed in the turbulent regimes of the
Kuramoto-Sivashinsky equation [39]. The evolution
of the spatial Fourier spectra as a function
of $\eta$ clearly shows the increasing disorder of
spatial patterns and confirms the description
made in the introduction. Furthermore, the correlation
length for $\eta < \eta_s \simeq 200$, where the system
presents either a stationary or a chaotic time
evolution, is very close to 1.

Fig. 3. Space-time evolutions of u(x, t) at $\eta = 216$ at two
different time intervals of 40 s each.

Fig. 4. Spatial power spectra at different values of $\eta$: (a)
$\eta = 164$; (b) $\eta = 216$; (c) $\eta = 348$. $k_c$ is the critical wavenumber,
which is about 3.11 cm$^{-1}$.
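
A time averaged spatial power spectrum of the kind discussed above can be computed from the sampled field in a few lines. The sketch below is only an illustration: the function name, the array layout u[t, x], the removal of the spatial mean and the normalisation by the square of the number of points are our choices and are not specified in the text.

```python
import numpy as np


def mean_spatial_spectrum(u, length=1.0):
    """Time averaged spatial power spectrum of a periodic field u[t, x].

    `length` is the circumference in the units chosen for x. Returns the
    wavenumbers k and the power in each mode averaged over all snapshots.
    """
    n = u.shape[1]
    # Remove the spatial mean of each snapshot so that k = 0 carries no power.
    w = u - u.mean(axis=1, keepdims=True)
    power = np.abs(np.fft.rfft(w, axis=1)) ** 2 / n**2
    k = 2.0 * np.pi * np.fft.rfftfreq(n, d=length / n)
    return k, power.mean(axis=0)
```

For an ordered pattern of rolls the spectrum returned by this function is dominated by sharp peaks at the roll wavenumber and its harmonics, whereas a disordered pattern spreads the power over a broad band.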

3.  The  weak  turbulence 

3.1.  Reduction  to  a  binary  code 

As we discussed in the previous section, for
$\eta > 200$ the space-time evolution of u(x, t) shows
that in the turbulent domains the time evolution
is characterised by the appearance of large oscillatory
bursts that locally destroy the spatial
order. By contrast, in laminar regions the oscillations
remain very weak. Thus the two regions
can be identified by measuring the local peak to
peak amplitude for a time interval $\delta t$ comparable
with the mean period of the oscillation,
that is,

$u_{pp}(x, t) = \max_{\tau}[u(x, \tau)] - \min_{\tau}[u(x, \tau)]$ (1)

with $t < \tau < t + \delta t$. Choosing a cutoff a, setting
to 1 all the points in which $u_{pp} > a$ and to 0 all
the other points, the space-time dynamics is
reduced to a binary code in which 1 stands for
“turbulent” and 0 for “laminar”. As an example
of such a code we show the space-time evolution
of u(x, t) at $\eta = 216$ in fig. 5a, and at $\eta = 248$ in
fig. 5b, the black and white regions corresponding
to turbulent and laminar domains respectively.
We remark that the qualitative features of
these pictures are rather independent of the
precise value of the cutoff. We can easily verify
that the code catches the main properties of the
dynamics by comparing fig. 5a with figs. 3a and
3b. Indeed, we clearly see that the most oscillating
and disordered regions of fig. 3 correspond to
black points in fig. 5a, whereas ordered and not
oscillating regions are represented by white
points.
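
The reduction to a binary code can be sketched as follows: for every spatial point the peak to peak amplitude is computed over a sliding time window, and the result is compared with the cutoff a. This is a minimal illustration, not the code used for the experiment; the function name, the array layout u[t, x] and the choice of expressing the window in samples are our assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def binary_code(u, window, cutoff):
    """Reduce a space-time field u[t, x] to a binary code.

    `window` is the number of consecutive time samples over which the
    local peak to peak amplitude is measured; `cutoff` is the threshold a.
    Returns an array of shape (T - window + 1, X) with 1 for turbulent
    points and 0 for laminar points.
    """
    # windows[t, x, :] holds the samples u[t : t + window, x].
    windows = sliding_window_view(u, window, axis=0)
    u_pp = windows.max(axis=-1) - windows.min(axis=-1)
    return (u_pp > cutoff).astype(np.uint8)
```

Choosing the window close to one mean oscillation period keeps the code local in time while still capturing a full oscillation.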

At $\eta = 216$ (fig. 5a) a wide laminar region
surrounds completely the turbulent patches that
remain localized in space, after their appearance.
Furthermore, the nucleation of a turbulent domain
has no relationship with the relaxation of
another one. In contrast, at $\eta = 248$ (fig. 5b), the
turbulent regions migrate and slowly invade the
laminar ones. This last regime, which sets in for
$\eta > 245$, is very similar to those obtained in theoretical
models [13, 14]. The change from the regime
of fig. 5a to that of fig. 5b is reminiscent of
percolation [40]. Indeed, percolation has been
proposed as one of the possible mechanisms for
the transition to spatiotemporal intermittency.

Fig. 5. Binary representation, at a = 1.5 °C/cm, of the space-time
evolution of u(x, t) at (a) $\eta = 216$ and (b) $\eta = 248$. The
dark and white areas correspond to turbulent and laminar
domains respectively.

3.2.  Statistical  distribution  of  ordered  regions 

Following  a  method  also  used  in  numerical 
models  [13, 14],  we  quantitatively  characterize 
such  a  behaviour  by  computing,  over  a  time 
interval  of  10'*  s  (this  time  corresponds  to  rough¬ 
ly  10^  characteristic  times  of  the  system),  the 
distribution P(x) of the laminar domains of
length x. For $\eta < 248$ P(x) decays with a power
law. The exponent $\mu$ does not depend, within our
accuracy, either on a or on $\eta$. Its average value
is $\mu = 1.9 \pm 0.1$. On the other hand, for $\eta > 248$,
the decay of P(x) for x > 0.1 is exponential with
a characteristic length 1/m. The existence of two
different regimes is clearly seen in figs. 6a and 6b,
which display P(x) versus x at $\eta = 241$ and $\eta = 310$.
Looking at fig. 6a we clearly see that the
decay of P(x) begins for a length scale that is
smaller than the roll size. This rather strange
result has an explanation, because, as we remarked
in section 2.2, the main energy contribution
to the time dependent regimes comes from
the high spatial frequencies.

Fig. 6. Distribution P(x) of the laminar regions of length x:
(a) $\eta = 241$, algebraic decay with exponent 1.9; (b) $\eta = 310$
and a = 1.6 °C/cm, exponential decay with a characteristic
length 1/m = 0.10. The solid lines are obtained from eq. (4).
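
Given a binary code with 1 for turbulent and 0 for laminar points, the distribution of laminar domain lengths can be accumulated as in the following sketch. The periodic boundary of the annulus is taken into account by rotating each row so that it starts on a turbulent site. The function names and the normalisation of lengths by the number of points N are our assumptions for the example.

```python
import numpy as np


def laminar_lengths(row):
    """Lengths, in grid points, of the laminar runs in one periodic row."""
    row = np.asarray(row)
    if not row.any():
        # No turbulent site: the whole ring is a single laminar domain.
        return [int(row.size)]
    # Rotate so that the row starts on a turbulent site; then no laminar
    # run is split by the periodic boundary.
    start = int(np.argmax(row))
    rotated = np.roll(row, -start)
    lengths, run = [], 0
    for site in rotated:
        if site == 0:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def laminar_distribution(code):
    """Normalised histogram of laminar lengths over all rows of code[t, x].

    Assumes that at least one laminar domain is present. Returns the
    lengths x as fractions of the circumference and the probabilities P(x).
    """
    n = code.shape[1]
    lengths = []
    for row in code:
        lengths.extend(laminar_lengths(row))
    counts = np.bincount(np.asarray(lengths, dtype=int), minlength=n + 1)
    counts = counts.astype(float)
    return np.arange(n + 1) / n, counts / counts.sum()
```

Plotting the result on log-log and semi-log axes is a simple way to distinguish a power law decay from an exponential one.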

We find that the dependence of m on $\eta$ is the
following:

$m(a, \eta) = m_0(\eta) \exp(-a/a_0)$ (2)

with $a_0 = (0.87 \pm 0.06)$ °C/cm independent of $\eta$.
The dependence of $m_0$ on $\eta$ is reported in fig.
7. The linear best fit of $m_0^2$ to the points of
fig. 7 for $\eta > 246$ gives the following result:

$m_0(\eta) = m_1 (\eta/\eta_c - 1)^{1/2}$ (3)

with $\eta_c = 247 \pm 1$ and $m_1 = 117 \pm 2$. This equation
shows the existence of a well defined
threshold $\eta_c$ for the appearance of an exponential
decay in P(x). We also see that the characteristic
length $1/m_0$ diverges at $\eta = \eta_c$.

Fig. 7. Dependence of $m_0^2$ on $\eta$. The different symbols
pertain to different sets of measurements done either increasing
or decreasing $\eta$. The solid line is obtained from eq. (3).

In the range $200 < \eta < 400$, P(x) is very well approximated
by the following equation:

$P(x) = (A x^{-\mu} + B) \exp[-m(a, \eta) x]$, (4)

where $m(a, \eta)$ is given by (2), $\mu$ has the previously
determined value and A, B are free parameters
that can be easily determined. A fit to
our  experimental  P(x),  in  the  range  0.4°C/ 
cm  <  a  <  3°C/cm,  yields  A  =  10,  B  =  4x  lO’  for 
$\eta > \eta_c$ and $B = 0$ for $\eta < \eta_c$.
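
As a consistency check on the fitted parameters, the decay rate m can be evaluated at the conditions of fig. 6b. The sketch below assumes the square-root form $m_0 = m_1 (\eta/\eta_c - 1)^{1/2}$, which is what a linear fit of $m_0^2$ versus $\eta$ with threshold $\eta_c$ implies, together with the exponential dependence on the cutoff given by eq. (2). The numerical values are those quoted in the text; the function and constant names are ours.

```python
import math

ETA_C = 247.0  # threshold of the fit
M_1 = 117.0    # amplitude of the fit
A_0 = 0.87     # characteristic cutoff in deg C / cm, eq. (2)


def decay_rate(a, eta):
    """Decay rate m(a, eta) of P(x) above threshold (assumed form)."""
    if eta <= ETA_C:
        raise ValueError("the exponential decay is defined for eta > eta_c")
    m0 = M_1 * math.sqrt(eta / ETA_C - 1.0)
    return m0 * math.exp(-a / A_0)


m = decay_rate(a=1.6, eta=310.0)
print(1.0 / m)  # characteristic length, as a fraction of the circumference
```

With these numbers the characteristic length 1/m comes out at about 0.11, in reasonable agreement with the value 0.10 quoted in the caption of fig. 6 given the uncertainties on $a_0$, $\eta_c$ and $m_1$.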

The features of P(x) displayed by eqs. (3), (4)
are typical of phase transitions. Since the transition
point $\eta_c$ is very close to the point where the
behaviour like that of fig. 5b sets in, we conclude
that the transition to this behaviour may be a
phase transition. The main features of P(x) and
m for $\eta > \eta_c$ qualitatively agree with those obtained
in coupled maps [13-15], partial differential
equations [14, 17], and in a phenomenological
cellular automaton model [19] of
spatiotemporally intermittent regimes.

The presence of a power law decay of P(x) for
$\eta_s < \eta < \eta_c$ may be due to finite size effects.
Indeed, a cellular automaton model of this transition
[19] presents the same features when the
number of cells is reduced.

Recently, another method has been suggested
to identify the characteristic length involved in
the dynamics [41]. It is based on the comparison
of the root mean square value of the field with
that of its derivatives. Therefore, there is no
need to define a threshold as we have done.
However, we have not checked this method on
our data and we are not aware of other papers
on the subject.

4.  Thermodynamics  of  space-time  chaos

In  the  previous  sections  we  have  analysed  the 
transition  to  spatiotemporal  chaos  in  terms  of 
local  variables.  We  now  want  to  discuss  the 
behaviour  of  global  variables  of  the  system. 

4.1.  Statistical  properties  of  fluctuations 

We are interested in knowing the statistical
properties of the fluctuations $W(x, t) = u(x, t) - \langle u(x, t) \rangle$
($\langle\ \rangle$ means time average), of their spatial
Fourier transform W(k, t), of the energy E(t)
and of a suitably defined entropy S(t). The energy
is defined in the following way:

$E(t, N_v) = \sum_{i=0}^{N_v} u^2(x_i, t)$, (5)

with $2 \le N_v \le N$, where N is the total number of
spatial points. The total energy is E(t) = E(t, N).
The dependence of the energy on $N_v$ shows how
the root mean square (r.m.s.) value, $\Delta E(N_v)$, of
the energy fluctuations scales as a function of the
volume of integration $N_v$.

The spectral entropy [42] is defined in the
following way:

$\sigma(t) = -\frac{1}{S_0} \sum_k \Psi_k(t) \log \Psi_k(t)$, (6)

where $\Psi_k(t) = |\hat{u}(k, t)|^2 / E(t)$, with $\hat{u}(k, t)$ the
spatial Fourier transform of u(x, t), $S_0 = \log(N/2)$ is
the equipartition value of S(t) and N/2 the total
number of Fourier modes. The parameter $\sigma$ is 1
at the equipartition and 0 when only one mode is
excited. It is important to stress that E(t) and
$\sigma(t)$ are not exactly an energy and an entropy
but they behave like these two thermodynamic
quantities. We want also to stress that the properties
that we will show are almost independent
of the specific way in which the entropy is defined.
An example of this can be found in ref.
[49], where an entropy, which is well defined
from a statistical point of view, has been used to
analyse our data, showing a good agreement
with the results reported here. Our entropy is
defined in Fourier space because, as we will see,
the Fourier modes present more interesting
statistical properties. There is no distinction in
defining the energy either in real or in Fourier
space.
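
The two global quantities can be computed for a single snapshot as in the sketch below. The energy is taken as the sum of the squared field over the grid points, and the normalised spectral entropy as $\sigma = -\sum_k \Psi_k \log \Psi_k / \log(N/2)$ over the N/2 modes k = 1, ..., N/2. As an assumption of this example, $\Psi_k$ is normalised by the sum of the power over the retained modes, so that $\sum_k \Psi_k = 1$ exactly; the function names are ours.

```python
import numpy as np


def energy(u_t):
    """Sum of the squared field over all spatial points of one snapshot."""
    return float(np.sum(np.asarray(u_t) ** 2))


def spectral_entropy(u_t):
    """Normalised spectral entropy of one snapshot u_t[x] (N even).

    Returns 1 when the power is equally shared among the N/2 modes and 0
    when a single mode carries all the power.
    """
    u_t = np.asarray(u_t, dtype=float)
    n = u_t.size
    # Keep the N/2 modes k = 1 .. N/2 and drop the mean (k = 0).
    modes = np.fft.rfft(u_t)[1 : n // 2 + 1]
    psi = np.abs(modes) ** 2
    psi = psi / psi.sum()
    nonzero = psi > 0  # convention: 0 * log(0) = 0
    s = -np.sum(psi[nonzero] * np.log(psi[nonzero]))
    return float(s / np.log(n / 2))
```

Applying both functions to every snapshot of a run gives the time series E(t) and $\sigma(t)$ whose distributions can then be studied.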

We now analyse the statistical properties near the transition point for spatiotemporal chaos.
In fig. 8 we show, for two values of $\eta$, the distributions $P(W_r)$, of the real part $W_r$ of
$W(k, t)$ at $k/k_c = 2$ ($k_c$ is the critical wavenumber of the instability), and $P(W)$, of
$W(x, t)$, at $x = 0.5$. We clearly see that at $\eta = 164$ (when the time evolution is biperiodic)
$W_r(k, t)$ and $W(x, t)$ have a two-peaked distribution. For $\eta > \eta_s$ (fig. 8c), $P(W_r)$ tends
to a Gaussian distribution; in contrast $P(W)$ (fig. 8d) is not a Gaussian. The fact that the Fourier
mode amplitudes have Gaussian distributions, whereas the local dynamics does not, has also been
reported in ref. [1b] and widely discussed in refs. [23, 31]. The reason for this effect is that the
small $k$ Fourier modes are coarse grained variables of the system because they imply an average
over many correlation lengths [23, 31]. In figs. 8e, 8f we show, for different values of $\eta$, the
distributions $P(E)$ and $P(\sigma)$ of $E(t)$ and $\sigma(t)$ respectively. We observe that above the
transition point $\eta_s$ the entropy fluctuations are clearly enhanced and that the average value of
$\sigma$ grows as a function of $\eta$.

Fig. 8. Distributions of the fluctuations of the spatial Fourier mode amplitude $W_r(k, t)$ (a), (c)
and the fluctuations in real space $W(x, t)$ (b), (d) at $x = 0.5$ and $k/k_c = 2$ for different
values of $\eta$: (a), (b) $\eta = 164$; (c), (d) $\eta = 300$. Dashed lines correspond to Gaussian fits.
Distributions of $E$ (e), and of $\sigma$ (f), at $\eta = 164$ solid lines; $\eta = 219$ dotted lines;
$\eta = 248$ dashed lines.

Although the distributions for $\eta > \eta_s$ assume a
characteristic  bell  shape,  they  are  not  Gaussians, 
because  a  significant  asymmetry  is  observed.  The 
existence  of  this  asymmetry  is  quite  reasonable 
because  the  distribution  of  a  nonlinear  function 
of  Gaussian  distributed  variables  (the  W  in  our 
case)  is  not,  in  general,  Gaussian  [43]. 

The deviation of a generic distribution $P(v)$ from a Gaussian may be studied in a better way
by computing

$$M_i = \frac{\mathcal{M}_i(v - \bar{v})}{[\mathcal{M}_2(v - \bar{v})]^{i/2}}, \qquad (7)$$

where $\mathcal{M}_i(v - \bar{v})$ is the moment of order $i$ of the variable $v - \bar{v}$ and $\bar{v}$ is
the mean value of $v$. For a Gaussian, $M_3$ (the skewness) and $M_4$ (the flatness) are equal to 0
and 3 respectively. These two quantities have been computed for $P(W)$, $P(W_r)$, $P(E)$ and
$P(\sigma)$ at different values of $\eta$. The results for $P(W_r)$ are shown in figs. 9a, 9b as a
function of $\eta$ for several values of $k$ (the imaginary part behaves in the same way). The
statistical accuracy in the calculation is of the order of ±0.3, and it has been computed using
standard methods. The skewness (fig. 9a) is always close to zero because the distributions of our
data are always rather symmetric with respect to the mean. In contrast, the flatness, which accounts
for the tails of the distributions, changes considerably as a function of $\eta$; see fig. 9b.
$M_4(W_r)$ tends to 3 for almost all the modes for $\eta > \eta_s$, confirming the transition to a
Gaussian distribution. We point out that the same transition does not occur in $M_4(W)$ for all the
spatial points, indicating that the local dynamics does not, in general, have a Gaussian distribution.

Fig. 9. (a) The skewness $M_3(W_r)$ and (b) the flatness $M_4(W_r)$ of the distribution of $W_r$,
versus $\eta$ for different values of $k$: ○, $k/k_c = 1$; ♦, $k/k_c = 1.5$; △, $k/k_c = 2$; ▲,
$k/k_c = 3$. (c) $M_3(E)$ and (d) $M_4(E)$ versus $\eta$.
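
The skewness and flatness used above are standardized central moments, so they can be estimated from a sample in a few lines. The sketch below is only an illustration; the function name is hypothetical and no claim is made that this is how the authors' analysis was coded.

```python
import numpy as np

def normalized_moment(v, order):
    """Central moment of the given order divided by the variance to the power order/2."""
    d = np.asarray(v, dtype=float) - np.mean(v)
    return np.mean(d ** order) / np.mean(d ** 2) ** (order / 2)

# For a Gaussian sample, order 3 should be near 0 and order 4 near 3.
rng = np.random.default_rng(0)
sample = rng.normal(size=200_000)
skewness = normalized_moment(sample, 3)
flatness = normalized_moment(sample, 4)
```

A two-peaked distribution, such as the one observed in the biperiodic regime, gives a flatness well below 3; the extreme case of two equal sharp peaks gives exactly 1.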

In figs. 9c and 9d we show, as a function of $\eta$, $M_3(E)$ and $M_4(E)$ respectively. We observe
that in this case $M_4(E)$ does not reach 3 for $\eta > \eta_s$ and $M_3(E)$ is different from zero,
confirming the existence of the strong asymmetry seen in fig. 8e. Furthermore, the fact that
$M_3(E)$ and $M_4(E)$ are constant for $\eta > \eta_s$ indicates that the shape of the distribution does
not change as a function of $\eta$.

We now analyse the behaviour, as a function of $N_v$, of the r.m.s. energy fluctuations,
$\Delta E(N_v)$. For $\eta < \eta_s = 200$ the relative fluctuations $\delta E = \Delta E(N_v)/E(N_v)$
do not follow a well defined law as a function of $N_v$. In contrast, for $\eta > \eta_s$ we find that
$\delta E$ decreases as a function of $N_v$ as a power law $N_v^{\mu}$, in the range $N_v > 2$
(fig. 10a). The exponent $\mu(\eta)$ tends asymptotically to $-1/2$ (fig. 10b). The value of $\mu$
indicates that above $\eta_s$, $E(t, N_v)$ behaves, as a function of the number of points, as an
additive thermodynamic quantity.

Fig. 10. (a) Dependence on $N_v$ of the relative energy fluctuations $\delta E$ for different values
of $\eta$: □, $\eta = 219$; ■, $\eta = 233$; ○, $\eta = 247.5$; ♦, $\eta = 278$; △, $\eta = 310$; ▲,
$\eta = 327$. (b) Dependence on $\eta$ of the exponent $\mu$; the different symbols pertain to
different runs.
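
The exponent of such a power law can be estimated by a straight-line fit in log-log coordinates. The sketch below assumes the relative fluctuations have already been measured at several integration volumes; the function name and the least-squares fit are illustrative choices, not the authors' stated procedure.

```python
import numpy as np

def power_law_exponent(n_v, delta_e):
    """Slope of log(delta_e) versus log(n_v), i.e. mu in delta_e ~ n_v**mu."""
    slope, _intercept = np.polyfit(np.log(n_v), np.log(delta_e), 1)
    return slope
```

For a sum of independent, identically distributed contributions the relative fluctuation falls off as the inverse square root of the number of terms, which is why an exponent of $-1/2$ is read as the signature of an additive quantity.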

4.2.  Fluctuation-dissipation  theorem 

Summarizing  the  results  up  to  this  point  we 
observe  that  the  distributions  of  the  Fourier 
mode  amplitudes  tend  to  a  Gaussian  distribution 
for $\eta > \eta_s$. Furthermore, $\delta E$ decreases as a function
of the integration volume. These findings
support  a  thermodynamical  description  of  the 
transition  to  spatiotemporal  chaos,  in  which  the 
Fourier  modes  may  be  considered  as  an  ensem¬ 
ble  of  non-interacting  degrees  of  freedom.  An 
important  question  is  how  a  “generalized  tem¬ 
perature”  of  the  system  may  be  defined  [23].  The 
main  difficulty  arises  from  the  fact  that  ^iW)  is 


not  constant  as  a  function  of  k  but  presents  a 
high-frequency  cutoff  as  in  all  hydrodynamic  sys¬ 
tems  [31]. 

This  problem  has  been  bypassed  by  Hohen- 
berg  and  Shraiman  [23],  who  suggested  using  the 
fluctuation-dissipation  theorem  (FDT)  [43]  to 
define  the  temperature  of  the  system.  An  exten¬ 
sion  of  this  theorem  has  recently  been  proposed 
by  Falcioni  et  al.  [44],  who  have  demonstrated 
that  it  can  also  be  successfully  used  in  dynamical 
systems  presenting  a  chaotic  time  evolution. 

The application of the fluctuation-dissipation theorem requires knowledge of the linear
response, $\delta(x, t)$, of the system, which can be measured by perturbing it with a very small
signal. From a practical point of view this is a very difficult task even in numerical systems [44],
because a very small signal (the response to the perturbation) has to be extracted from the natural
fluctuations of $u(x, t)$, which are much bigger than $\delta(x, t)$. Therefore, the error in the
calculation of the response may be very large.

In order to verify the approach proposed by Hohenberg and Shraiman, we perturbed our system
locally with a heat source constructed with a small resistor having a negligible thermal response
time and a size smaller than one half the depth of the layer. The resistor was heated periodically
by electric current pulses, whose frequency, amplitude and duty cycle were changed in order to
measure $\delta(x, t)$ at many different frequencies and amplitudes. At a fixed $\eta$ we sent a
series of about ten pulses while we were measuring the space-time evolution of the system
$u_p(x, t) = u(x, t) + \delta(x, t)$, that is, the natural fluctuation plus the response to the
perturbation. The time 0 of these series coincides with the first pulse. We repeated this operation
at least 100 times, and by synchronously averaging the different time series we obtained the
space-time evolution $\delta(x, t)$. This is possible because the mean of $u(x, t)$ is zero as a
consequence of its chaotic time evolution. As an example we show in fig. 11a the time evolution of
$\delta(x, t)$ produced by pulses of 0.1 s with a duty cycle of 50 s at $\eta = 230$. We observe the
existence of a propagation, starting from the point where the perturbation is produced (arrow in the
figure). The propagation is damped before it can invade all the structure. In fig. 11b we show the
time evolution of the spatial Fourier spectrum of the perturbation of fig. 11a. Here we clearly see
an inverse cascade of energy going towards small $k$ and starting from wavevectors corresponding to
the characteristic size of the perturbation (length of the heating resistor). It is interesting to
notice that the maximum of the Fourier spectrum never goes below the length scale of $k_c$.

Fig. 11. (a) Space-time evolution of the perturbation produced by the heater at $\eta = 230$; the
arrow indicates the position of the heater producing the perturbation; (b) Time evolution of the
spatial Fourier spectrum of the perturbation at $\eta = 230$; (c) Spectrum $S_E(f)$ of the energy at
$\eta = 300$. Stars correspond to $R_E(f) = I(f)/f$ measured at different frequencies. The two
continuous lines are the best fits of $S_E(f)$ (1) and of $R_E(f)$ (2).
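
The synchronous averaging step can be illustrated with synthetic data. In the sketch below the response shape, the noise model (independent Gaussian noise standing in for the zero-mean natural fluctuations) and all names are assumptions chosen only to show why averaging aligned records isolates the response.

```python
import numpy as np

def synchronous_average(records):
    """Average repeated recordings that are aligned on the first heating pulse.

    records has shape (n_repeats, n_times, n_points).
    """
    return np.mean(np.asarray(records, dtype=float), axis=0)

rng = np.random.default_rng(0)
n_repeats, n_times, n_points = 400, 50, 8
t = np.arange(n_times)[:, None]
x = np.arange(n_points)[None, :]
delta = 0.1 * np.exp(-t / 10.0) * np.exp(-((x - 3) ** 2) / 4.0)      # hypothetical response
noise = rng.normal(0.0, 1.0, size=(n_repeats, n_times, n_points))     # stand-in for u(x, t)
estimate = synchronous_average(delta + noise)                         # approximates delta
```

With independent repetitions the residual fluctuation in the average shrinks as the inverse square root of the number of repetitions, which is consistent with the long averaging times reported for a response much smaller than the natural fluctuations.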

The FDT has been verified by comparing the spectrum $S_E(f)$ of $E(t)$, directly measured, with
that computed from the response function. Specifically, once $\delta(x, t)$ is obtained, we compute
the corresponding energy

$$E_\delta(t) = \sum_{i=0}^{N} \delta^2(x_i, t) \qquad (8)$$

and its Fourier transform is divided by the excitation spectrum in order to have a normalized
response. The imaginary part of this normalized Fourier transform is $I(f)$. The FDT implies that

$$S_E(f) = A\, I(f)/f, \qquad (9)$$

where $A$ is a constant (proportional to the temperature). In fig. 11c we show the spectrum
$S_E(f)$ obtained at $\eta = 300$. The stars in this figure represent $R_E(f) = I(f)/f$, which has
been translated by $\log A$ in order to superimpose it on $S_E(f)$. The two continuous lines (almost
indistinguishable) correspond to best fits of $S_E(f)$ and $R_E(f)$. The good agreement between the
computed spectrum and the measured one shows that the FDT is satisfied within experimental errors,
and thus a temperature of the system can be defined, at least using a spatially averaged variable
such as the energy. It would certainly be more interesting if we could show that FDT
holds also for long wavelength Fourier modes.
We  also  tried  this  measurement,  but  the  results 


M.  Caponeri,  S.  Ciliberto  I  Thermodynamic  aspects  of  spatiotemporal  chaos 




were  dependent  on  the  model  that  we  used  to 
schematize  the  spatial  distribution  of  the  pertur¬ 
bation.  Thus  we  do  not  have  a  clear  check  of 
FDT  for  all  k.  Furthermore  the  above- 
mentioned  difficulties  of  measuring  the  response 
function  do  not  allow  us  to  use  FDT  to  follow 
how  the  temperature,  defined  by  eq.  (9), 
changes  as  a  function  of  the  control  parameter. 
Indeed,  the  spectrum  shown  in  fig.  11c  took 
almost  two  days  of  averaging  to  be  obtained.  So 
we  propose  here  an  approach  that  is  rather 
similar  to  that  of  [23],  but  it  uses  the  natural 
fluctuations  of  the  system. 

4.3.  Definition  of  a  temperature 

We know [43] that, for a thermodynamic system at constant pressure and volume, the r.m.s.
fluctuations of energy and entropy are proportional to $T\sqrt{k_B C_v}$ and $\sqrt{k_B C_p}$
respectively, where $C_v$ and $C_p$ are the specific heats at constant volume and constant pressure,
$T$ is the temperature of the system and $k_B$ is the Boltzmann constant.

To check these points we show in fig. 12 the mean values of $E$, $\sigma$ and the r.m.s. values
$\Delta E$, $\Delta\sigma$ of their fluctuations, as functions of $\eta$. We see that
$\langle E \rangle$ (fig. 12a) and $\langle \sigma \rangle$ (fig. 12b) are monotonically increasing as
a function of $\eta$. The behaviour of $\langle \sigma \rangle$ above $\eta_s = 200$ indicates that the
power spectrum shape does not change as a function of $\eta$. From fig. 12d we immediately realise
that $\Delta\sigma$ increases by a considerable amount near $\eta_s$, as indeed we have already
observed in the distributions of fig. 8f. In fig. 12d, we also notice that $\Delta\sigma$ is almost
constant above $\eta_s$. As a consequence, we can make the hypothesis that $C_p$ of our
“thermodynamic system” is constant above $\eta_s$. Since we cannot distinguish in our system between
a constant volume and a constant pressure process, we assume $C = C_p = C_v$. Such a hypothesis has
to be verified a posteriori.

Fig. 12c shows that for $\eta > \eta_s$ $\Delta E$ grows linearly as a function of $\eta$ (solid
line in fig. 12c). As a consequence, the ratio $\Delta E/\Delta\sigma$ may be considered
proportional to the “generalized temperature” ($T = r\eta$) of the system for $\eta > \eta_s$,
where the Fourier mode amplitudes have a Gaussian distribution. From the data we obtain
$r = 73 \pm 1$.

In order to demonstrate that our definitions are self-consistent, we construct for $\eta > 248$
a free energy $F$:

$$F = -C\, r\eta \ln(r\eta) + (\sigma_p + C)\, r\eta, \qquad (10)$$

where $\sigma_p = 0.817 \pm 0.005$, $C = 0.165 \pm 0.005$. From this free energy we may compute
$\langle E \rangle$, $\langle \sigma \rangle$ and $C$ as a function of $\eta$, via appropriate
thermodynamic relationships [43]. The solid lines, shown in figs. 12a and 12b, are the result of
the calculations, and are in agreement with the experimental points. This verifies all the
hypotheses made to define the “generalized temperature” of the system.

Fig. 12. Dependence on $\eta$ of the mean values of the energy $E$ (a), the entropy $\sigma$ (b),
and of the r.m.s. values of their fluctuations $\Delta E$ (c), $\Delta\sigma$ (d).
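
As a consistency check on eq. (10), the quantities mentioned above can be worked out explicitly. The sketch below assumes the usual relations $\sigma = -\partial F/\partial T$, $E = F + T\sigma$ and $C = T\,\partial\sigma/\partial T$ with $T = r\eta$, and writes the offset constant of eq. (10) as $\sigma_p$. The source refers only to “appropriate thermodynamic relationships [43]”, so this particular choice of relations is an assumption.

\[
\begin{aligned}
\langle \sigma \rangle &= -\frac{\partial F}{\partial T} = C \ln(r\eta) - \sigma_p, \\
\langle E \rangle &= F + T \langle \sigma \rangle = C\, r\eta, \\
T\, \frac{\partial \langle \sigma \rangle}{\partial T} &= \frac{\partial \langle E \rangle}{\partial T} = C .
\end{aligned}
\]

Under these assumptions the mean energy grows linearly with $\eta$ and the specific heat is constant, as hypothesized. With $r = 73$, $C = 0.165$ and $\sigma_p = 0.817$, the first line gives $\langle \sigma \rangle \approx 0.83$ at $\eta = 300$, which lies inside the allowed range between 0 and 1 of the normalized spectral entropy.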

In fig. 13a we show the behaviour of the free energy $F$ versus $\eta$. We see that $F$ is a
decreasing and rather smooth function of $\eta$. In order to see if this function behaves like a
free energy, we have measured the time evolution of $F$ when the control parameter $\eta$ is
suddenly changed. The result of one measurement is shown in fig. 13b, where $F$ is shown as a
function of time when $\eta$ is changed from 330 to 452 at time 0. The mean entropy and the mean
energy were computed on an interval of time of about 100 s. We see that $F$ relaxes to the new value
without strong oscillations, whose amplitude is similar to the free energy fluctuations in the
steady state shown in fig. 13c. This kind of dynamical behaviour shows that the function $F$ that we
defined behaves like a free energy also from a dynamical point of view.

Fig. 13. (a) Dependence of the free energy $F$ on $\eta$; (b) Transient behaviour of $F$ when $\eta$
is suddenly changed from 330 to 452; (c) Time evolution of $F$ in the steady state at $\eta = 415$.

4.4. Extension of the results to small $\eta$

We want now to compare these results, based on the analysis of global variables of the system,
with those obtained in section 3 using a local analysis. In the previous section we have seen that
using the characteristic length of turbulent regions as an order parameter, we were able to identify
a well defined transition point $\eta_s$ for the appearance of space-time intermittency. In this
section we have seen that above $\eta_s$ the system displays thermodynamic properties and that a
temperature can be defined. Now there is the important question of understanding the transition
from a state below $\eta_s$ that does not display thermodynamic properties to a state above
$\eta_s$ that does display these properties. To the best of our knowledge there is not a clear
answer to this problem. Since we do not have our own, we call attention to some other relevant
experimental facts which should provide useful information for a more complete understanding of the
phenomenon. We consider the fluctuations of the entropy $\Delta\sigma$. Looking at fig. 12 we see
that, close to the transition point at $\eta = 200$, $\Delta\sigma$ is much higher than for any
other value of $\eta$. If we force our thermodynamic analogy, the divergence of $\Delta\sigma$
corresponds to a divergence of the specific heat $C$ near the transition point, which is a typical
feature of a second order phase transition. To see if our thermodynamic analogy could be extended
to a thermodynamic formalism that could relate the different quantities via standard thermodynamic
relationships, even below $\eta_s$, we adopted the following procedure. We made a best fit of
$\langle \sigma \rangle$ versus $\eta$ in the full measurement range. The fit is shown as a
continuous line in fig. 14a. By using the standard thermodynamic relationship
$C = (\partial\sigma/\partial\eta)\, \eta$ one can compute $C$ versus $\eta$ from the best fit of
$\langle \sigma \rangle$.
Taking  into  account  the  value  of  previously 
Fig. 14. (a) Spectral entropy $\sigma$ versus $\eta$; the continuous line is a best fit of the data;
(b) Entropy fluctuations versus $\eta$; the continuous line is obtained from the fit of $\sigma$ via
standard thermodynamic relationships.

computed, one obtains the fluctuations $\Delta\sigma$ as a function of $\eta$. The result of these
calculations is the continuous line in fig. 14b, which fits rather well the divergence of entropy
fluctuations. This result indicates that an extension of our thermodynamic formalism is reasonable,
although there are no theoretical justifications for doing that. If one believes this thermodynamic
analogy, one concludes that some macroscopic variables present features of a second order phase
transition (divergence of the specific heat) as presented by the statistics of the local variables
used in section 3.


5.  Estimating  the  number  of  “degrees  of 
freedom” 

In  this  section  we  want  to  analyse  the  number 
of  “degrees  of  freedom”  that  may  produce  the 
behaviour  described  in  previous  sections.  By 
“degrees  of  freedom”  we  mean  a  number  which 
is  close  to  the  Lyapunov  and  fractal  dimensions 
[22]. 


However, as we have already mentioned in the
introduction,  the  direct  calculation  of  these  two 
quantities  on  experimental  signals  is  almost  im¬ 
possible  when  the  dimension  of  phase  space  is 
greater  than  5  [30,45].  To  overcome  this  prob¬ 
lem  we  have  applied  the  Karhunen-Loeve 
(K.L.)  decomposition,  which  has  recently  been 
applied  to  spatiotemporal  dynamics  [33-35]  and 
to  low-dimensional  chaos  [50]. 

Here we report only a summary of the results because the details can be found in ref. [32]. Let
us recall very briefly what K.L. decomposition is. If we have a field $W(x, t)$ this method allows us
to find a basis of orthonormal functions $\Psi_n(x)$ by solving an integral equation whose kernel is
the two-point correlation function $K(x, x') = \langle W(x, t)\, W(x', t) \rangle$
($\langle\ \rangle$ means time average). The functions $\Psi_n(x)$ are the eigenfunctions of the
integral equation

$$\int_0^L K(x, x')\, \Psi_n(x')\, dx' = \lambda_n \Psi_n(x), \qquad (11)$$

whose eigenvalues are $\lambda_n$. Then we can write

$$W(x, t) = \sum_{n=1}^{\infty} A_n(t)\, \Psi_n(x) \qquad (12)$$

with $A_n(t)$ and $\Psi_n(x)$ that satisfy the following conditions:

$$\int_0^L \Psi_n(x)\, \Psi_m(x)\, dx = \delta_{n,m}, \qquad (13)$$

$$\lambda_n \delta_{n,m} = \langle A_n(t)\, A_m(t) \rangle. \qquad (14)$$

Here $L$ is the size of the system, $\delta_{n,m}$ is the Kronecker symbol, $\lambda_n$ is the energy
of the mode $n$ and

$$E = \sum_{n=1}^{\infty} \lambda_n \qquad (15)$$

is the total energy of the system. To apply the method we first compute the two-point correlation
function $K(x, x')$ and then eq. (11) is solved. Finally, once the $\Psi_n(x)$ are known the
$A_n(t)$ can be obtained by projecting the space-time evolution onto this basis, using eq. (12) and
eq. (13). Here the field $W(x, t)$ to be decomposed using the K.L. method represents the
fluctuations of $u(x, t)$ with respect to the time averaged spatial pattern
$\langle u(x, t) \rangle$. We have studied how the energy of $W(x, t)$ is distributed among the
$\Psi_n(x)$ as a function of the control parameter $\eta$. To do this we define the quantity

$$R(m) = \sum_{n=1}^{m} \frac{\lambda_n}{E}, \qquad (16)$$


which is the percentage of energy contained in the first $m$ modes. The quantity $R(m)$ versus $m$
is shown in fig. 15a for five different values of $\eta$. We clearly see that the initial slope of
$R(m)$ at $m = 1$ decreases as a function of $\eta$. This means that the number of modes that
contain a certain percentage of the total energy increases as a function of $\eta$. In fig. 15b we
show this number, that is, the minimum value $M$ of $m$ for which $R(m) > 0.93$, as a function of
$\eta$. The value of 93% has been chosen because in the case of periodic oscillation this is the
amount of the total energy contained in the first 2 modes. If a different threshold is used, $M$ can
change by 1 or 2 but the shape of the curve remains the same. Looking at fig. 15b we see that $M$ is
less than 10 for $148 < \eta < 175$, where the system presents periodic, biperiodic and weakly
chaotic oscillations, while retaining spatial coherence (see section 2). Then it has a big jump at
$\eta = 190$, where the system presents a bursting regime that leads to spatiotemporal intermittency
(see section 2). Above $\eta = 240$ we note a saturation that in part is due to the limited number
of data points in space, but we notice that the behaviour of $M$ as a function of $\eta$ is rather
similar to that of the spectral entropy computed in section 4 (see fig. 12).
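
On a discrete spatial grid the integral equation (11) becomes a symmetric matrix eigenproblem, so the whole procedure can be sketched in a few lines. The code below is an illustration under assumptions: `w` is an array of shape (time samples, spatial points) from which the time-averaged pattern has already been subtracted, the grid-spacing factor of the integral is absorbed into the eigenvalues, and the function names are hypothetical.

```python
import numpy as np

def kl_decomposition(w):
    """Karhunen-Loeve modes of a fluctuation field w with shape (n_times, n_points)."""
    w = np.asarray(w, dtype=float)
    k = w.T @ w / w.shape[0]              # two-point correlation K(x, x'), time averaged
    lam, psi = np.linalg.eigh(k)          # symmetric matrix: real eigenvalues, ascending
    order = np.argsort(lam)[::-1]         # reorder by decreasing mode energy
    lam, psi = lam[order], psi[:, order]
    a = w @ psi                           # amplitudes A_n(t) by projection on the basis
    return lam, psi, a

def significant_modes(lam, threshold=0.93):
    """Smallest M such that the first M modes hold more than `threshold` of the energy."""
    r = np.cumsum(lam) / np.sum(lam)      # R(m) for m = 1, 2, ...
    return int(np.argmax(r > threshold) + 1)
```

For a single travelling wave the energy is shared by two modes (a sine and a cosine), so `significant_modes` returns 2, in line with the remark that two modes carry the chosen fraction of the energy for a periodic oscillation.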

We  computed  the  maximum  number  M  that 
can  be  accurately  estimated  with  a  given  number 
of  samples  in  space.  This  check  has  been  done  by 
reducing  the  number  of  samples  in  our  data.  We 
find  that  M  remains  reliable  up  to  roughly  1/2  of 
the  maximum  number  of  data  points  that  are 
sampled in space. We also checked the results by
increasing the number of spatial samples from
128 to 256 and we found that, for $\eta > 240$, the
number of significant eigenfunctions is about 75,
thus the correction is not very large.

The K.L. method converges very rapidly towards the final result even on data sets for which
calculations of the fractal dimension (f.d.) or the Lyapunov exponents give a wrong result because
of an insufficient number of points. In the insert of fig. 15b we show an expanded view of the
region in fig. 15b with $140 < \eta < 165$, where the system presents a low dimensional chaotic
regime. The stars represent the number $M$, whereas the crosses are the values of the f.d. for the
same values of $\eta$. The fractal dimension has been
computed using the method of Grassberger and
Fig. 15. (a) Dependence of $R(m)$ versus $m$ for five different $\eta$: ★, $\eta = 150$; □, $\eta = 160$; △, $\eta = 168$; ○, $\eta = 219$; +, $\eta = 310$. (b)
Dependence of $M$ on $\eta$. The insert is an expanded view of the interval $145 < \eta < 165$ (not all the experimental points are shown).
Dots correspond to $M$ whereas crosses represent the values of the fractal dimension.
Procaccia [46]. The phase space has been constructed
using 64 time series simultaneously recorded
at uniformly spaced spatial points. This
method is clearly better than using delayed coordinates
because it eliminates the ambiguity of the
choice  of  the  delay  time  [Al]*^.  When  looking  at 
the insert of fig. 15b, we see that there is a
remarkable agreement between $M$ and f.d. if the
latter is less than 5. Specifically $M$ is equal to the
next integer larger than f.d. This point has been
clearly discussed also in ref. [50]. To see if the
first $M$ eigenfunctions form a sufficient basis to
describe the space-time dynamics of the system
we computed f.d. using $A_n(t)$ with $1 \le n \le M$ to
reconstruct the phase space. The results are in
perfect agreement with those computed using 64
time series.

In  contrast,  when  M  is  larger  than  10,  al¬ 
though  we  observed  an  increase  of  f.d.,  we  were 
not able to estimate the fractal dimension reliably.
This quantity was greater than 6 for $\eta > 170$
but the error was more than 50% even when
using  16  000  data  points  for  each  of  the  64  time 
series.  The  reason  for  the  large  error  is  the 
appearance  of  the  intermittent  behaviour,  which 
does  not  allow  enough  resolution  (data  points) 
during  the  intermittent  bursts.  So  we  can  only 
say  that  the  f.d.  is  probably  greater  than  6  but,  in 
this  case,  we  cannot  make  a  comparison  with  M. 
This comparison can be made only on numerically
generated dynamics whose number of degrees
of freedom is exactly known. This has been
done  in  the  Kolmogorov-Spiegel-Sivashinsky 
equation  [16]  for  which  the  Lyapunov  dimension 
is  known  to  scale  with  the  system  size  [48].  Using 
the  same  criterion  as  that  used  on  experimental 
data,  we  find  that  M,  computed  with  K.L.,  is 
very  close  to  the  Lyapunov  dimension,  which 
was  about  30  for  the  specific  case.  More  details 
about  this  numerical  test  can  be  found  in  ref. 

[32]. 

There  is  no  ambiguity  in  the  choice  of  spatial  points 
because  the  different  points,  either  in  real  or  Fourier  space, 
can  be  considered  as  independent  variables  of  the  system  and 
this  is  not  in  general  true  for  delayed  coordinates. 


The numerical results [32, 34, 35, 45b] in
spatiotemporal systems (namely partial differential
equations and coupled maps) indicate that
there  is  a  correlation  between  the  value  of  M  and 
the  number  of  main  degrees  of  freedom  (in  the 
sense  of  Lyapunov  dimension)  involved  in  the 
dynamics.  There  is  not  a  clear  explanation  of  the 
correlation  [35,45b]  because  the  value  obtained 
by  K.L.  is  a  linear  estimate.  However,  from 
numerical  tests  one  could  conclude  that  the  sys¬ 
tem  spends  a  large  amount  of  time  in  a  linear 
subspace  and  only  occasionally  does  it  visit  the 
other  part  of  the  phase  space.  This  can  be 
another  way  of  describing  the  strong  inter- 
mittency  observed  in  spatiotemporal  chaos.  Be¬ 
cause  of  the  relationship  between  M  and  the 
Lyapunov  dimension  observed  in  numerical  ex¬ 
periments,  we  assume  that  the  value  of  M  can  be 
considered  as  a  rough  estimate  of  the  number  of 
the  main  degrees  of  freedom  involved  in  the 
dynamics. 

If  this  estimation  is  true,  we  can  claim  that 
probably  the  number  of  main  modes  involved  in 
the  spatiotemporal  chaotic  regimes  of  our  sys¬ 
tems  is  about  80.  This  result  is  rather  interesting 
because  it  shows  that  important  statistical  fea¬ 
tures,  like  those  shown  in  section  4,  may  be 
found  even  with  a  rather  small  number  of  modes 
involved  in  the  dynamics. 


6.  Conclusions 

Using  the  results  of  an  experiment  on 
Rayleigh-Benard  convection  in  an  annular 
geometry,  we  have  reviewed  several  methods 
that  can  be  used  in  order  to  analyse  the  transi¬ 
tion  to  spatiotemporal  chaos.  The  example  that 
we  have  chosen  is  very  useful  to  investigate  the 
transition  from  low  dimensional  chaos  to  weak 
turbulence  because  its  behaviour  is  rather  similar 
to  the  one  observed  in  many  other  one  dimen¬ 
sional  systems. 

In summary, the onset of spatiotemporal intermittency
in our cell displays features of a phase
transition that is reminiscent of a percolation.
This  result  has  been  obtained  by  reducing  the 
space-time  dynamics  to  a  binary  code,  which 
captures  the  relevant  features  of  the  phenom¬ 
enon.  The  universality  class  to  which  this  phe¬ 
nomenon  belongs  is  not  detectable  in  a  labora¬ 
tory  experiment  because  of  the  strong  finite  size 
effects.  However,  accurate  numerical  simulations 
on  cellular  automata  show  that  it  could  be  a 
percolation  [19, 15b]. 

Above  the  transition  point  to  space-time 
chaos  the  system  displays  thermodynamic  prop¬ 
erties:  the  Fourier  mode  amplitudes  have  Gaus¬ 
sian  distributions,  the  energy  of  the  system  scales 
as  an  additive  thermodynamic  quantity  and  the 
fluctuation-dissipation  theorem  is  satisfied  for  the 
energy.  Furthermore,  a  thermodynamic  tem¬ 
perature,  which  is  proportional  to  the  control 
parameter,  can  be  defined  using  the  natural  fluc¬ 
tuations  of  the  system.  The  most  relevant  fact  is 
that  a  phase  transition-like  behaviour  is  recov¬ 
ered  in  the  global  variables  of  the  system,  just  as 
in  the  local  ones. 

Finally,  a  rough  estimate  of  the  number  of 
degrees  of  freedom  (in  the  sense  of  Lyapunov 
dimension)  involved  in  the  dynamics  shows  that 
such  a  statistical  behaviour  is  produced  by  the 
interaction  of  about  80  modes.  This  estimate  has 
been  given  using  the  Karhunen-Loeve  decompo¬ 
sition  that  allows  us  to  give  also  some  informa¬ 
tion  on  the  most  important  spatial  structures  of 
the  dynamics  [32]. 

The  limited  number  of  modes  involved  and  the 
thermodynamic  properties  of  this  system  clearly 
show  the  importance  of  non-standard  statistics 
[49]  in  describing  such  phenomena. 

This  review  on  thermodynamic  properties  of 
space-time  chaos  is  far  from  complete  because  at 
least  two  other  important  aspects  have  been  left 
out.  One  concerns  two  dimensional  systems,  the 
other  defect-induced  turbulence.  Information 
about  these  features  can  be  found  in  other 
papers  [1,5, 18]. 
The invariant measure associated with a chain of coupled maps is investigated by performing a local orthogonal
decomposition.  The  same  approach  is  also  applied  to  the  tangent  space. 


1.  Introduction 

The  investigation  of  spatio-temporal  chaos  has 
recently  attracted  much  interest.  There  is  indeed 
hope  that  its  understanding  will  shed  new  light 
on  the  evolution  of  systems  with  many  degrees  of 
freedom.  In  particular,  a  very  general  question 
arises  as  to  whether  a  truly  random  signal  can  be 
distinguished  from  an  infinite-dimensional  cha¬ 
otic  one. 

Statistical  mechanics  has  provided  the  most 
powerful  tool  to  describe  low-dimensional 
strange  attractors  [1].  Likewise,  we  expect  it  to 
play  an  essential  role  also  in  the  characterization 
of  spatio-temporal  chaos.  However,  we  are  still 
far  from  a  satisfactory  development  of  this  line 
of  thought.  Rigorous  results  have  been  estab¬ 
lished  only  in  the  limited  case  of  a  chain  of 
everywhere  expanding  maps,  when  the  invariant 
measure  covers  the  whole  phase  space  [2].  In 
more  realistic  dissipative  systems,  difficulties 
arise  which  prevent  a  straightforward  application 
of  statistical-mechanics  concepts.  For  instance, 
the exponential contraction of volumes forces
the overall attractor to fill a lower-dimensional
manifold. Hence, unlike in Hamiltonian systems,
many degrees of freedom remain inactive
in the asymptotic evolution, yielding no contribution
to the stationary properties.

More  specifically,  let  us  refer  to  a  chain  of 
coupled  one-dimensional  maps.  The  state  of  the 
system is represented by the time-dependent scalar
field $x_t^i$ obeying the equation [3]

$$x_{t+1}^i = f\left((1-\sigma)\,x_t^i + \tfrac{1}{2}\sigma\,(x_t^{i+1} + x_t^{i-1})\right), \qquad (1.1)$$

where i and t are discrete space and time variables,
respectively; f( ) is a map of the interval
into itself, and σ is a parameter controlling the
strength of the diffusive coupling.
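As an illustration of eq. (1.1), the following minimal sketch iterates a closed (periodic) chain of logistic maps. The function names, the chain length, the random initial condition and the value σ = 2/3 are assumptions made for the example only; they are not taken from the simulations reported here.

```python
import random


def logistic(y, a=2.0):
    """Local map f(y) = a - y**2; for a = 2 it maps [-2, 2] into itself."""
    return a - y * y


def cml_step(x, sigma, f=logistic):
    """One step of eq. (1.1) on a closed chain (periodic boundaries)."""
    n = len(x)
    return [
        f((1.0 - sigma) * x[i] + 0.5 * sigma * (x[(i + 1) % n] + x[i - 1]))
        for i in range(n)
    ]


def iterate(x, sigma, steps):
    """Apply cml_step repeatedly and return the final state of the chain."""
    for _ in range(steps):
        x = cml_step(x, sigma)
    return x


# Illustrative run: 64 sites, hypothetical coupling strength sigma = 2/3.
rng = random.Random(0)
chain = [rng.uniform(-2.0, 2.0) for _ in range(64)]
chain = iterate(chain, sigma=2.0 / 3.0, steps=1000)
```

Since the argument of f is a convex combination of neighbouring sites, the state stays inside the interval on which f is defined, so no clipping is needed.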

Previous simulations in spatially extended systems
have shown that the fraction $\nu(\lambda)$ of
Lyapunov exponents larger than $\lambda$
tends to be independent of the chain length L for
$L \to \infty$ (thermodynamic limit) [4]. In other
words,  the  spectrum  of  Lyapunov  exponents  is  a 
well-defined  entity.  The  existence  of  a  limit  spec¬ 
trum  straightforwardly  implies  that  the  dimen¬ 
sion  D  of  the  attractor  (as  determined  from  the 
Kaplan-Yorke  formula)  is  an  extensive  quantity, 
linearly  dependent  on  the  length  L 

D  =  pL,  (1.2) 
where p is the so-called density of dimensions. In
the case of non-invertible dynamics, the sum of
all Lyapunov exponents can remain strictly positive,
so that the Kaplan-Yorke formula cannot be applied
[this is for instance the case of eq. (1.1) for a
sufficiently small coupling strength σ]. In order
to avoid such problems, we have always chosen
parameter values such that a meaningful
Lyapunov dimension can be determined (e.g.
$f(x) = 2 - x^2$ at the coupling strength σ adopted in
our simulations, yielding a dimension
density p = 0.6).
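The Kaplan-Yorke estimate referred to above can be sketched as follows. The function name and the convention of returning None when the formula does not apply are choices made for this example; the numerical spectrum in the usage line is the commonly quoted one for the Lorenz system, not data from this paper.

```python
def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke (Lyapunov) dimension from a list of Lyapunov exponents.

    Returns None when the sum of all exponents is non-negative, i.e. in the
    situation mentioned in the text where the formula cannot be applied.
    """
    lam = sorted(exponents, reverse=True)
    partial = 0.0
    for j, value in enumerate(lam):
        if partial + value < 0.0:
            if j == 0:
                return 0.0  # even the largest exponent is negative
            # j exponents have a non-negative sum; interpolate linearly.
            return j + partial / abs(value)
        partial += value
    return None


# Usage with an illustrative three-exponent spectrum.
d = kaplan_yorke_dimension([0.906, 0.0, -14.572])
```

For a chain of length L, dividing the result by L gives an estimate of the density of dimensions p of eq. (1.2).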

The state of affairs is much less clear when we
pass from closed chains (as above) to sub-chains
of an - in principle - infinite lattice [5]. This
second approach somehow corresponds to the
canonical-ensemble picture of statistical mechanics:
the system of interest (sub-chain of length
E) is coupled with a thermal bath given by the
rest of the chain. From the previous considerations,
the attractor corresponding to an isolated
system would fill, for E sufficiently large, a
pE-dimensional manifold (let us forget about multifractal
corrections, whose presence does not affect
the validity of the following argument). The
main effect of the coupling with the heat bath is
to add a sort of “external noise” dressing the
manifold along all directions, thus making
the resulting invariant measure E-dimensional.
However, this seemingly trivial result
does not provide complete information
about the scaling behaviour of the probability
density. In fact, in the case of spatially extended
systems, two independent scaling parameters are
to be taken into account: the size ε of the boxes
used to cover the phase space, and the length E
of the chain. The coarse-grained dimension
D(ε, E) is a function of both parameters. The
previous argument, together with careful numerical
simulations [6], strongly supports the hypothesis
that, taking first the limit ε → 0 (as
required by the definition of dimension),
D(ε, E) converges to E itself. Instead, it is not
clear what is to be expected if we first take the
limit E → ∞ (i.e. we observe the chain at fixed,
finite, resolution). Numerical simulations presented
below in this paper support the conjecture
that

$$D(\varepsilon, E) = p_c(\varepsilon)\,E, \qquad (1.3)$$

where $p_c$ is close to the dimension density p
defined in eq. (1.2). A rough theoretical argument
is presented to justify this conclusion. Although
all the numerical results presented in this
paper have been obtained from simulations with
logistic maps for the same parameter values, test
runs made for different parameters and different
maps indicate that they are more general.

The  plan  of  the  paper  is  as  follows.  First  we 
introduce  local  orthogonal  decomposition  (OD), 
then  we  apply  it  to  a  chain  of  maps  to  extract 
information about the nonuniformity of the probability
distribution and to better understand the
reason for the slow convergence exhibited by
D(ε, E). In section 3 we put forward the first
elements  of  a  possible  application  of  OD  to 
tangent  space,  as  well.  The  hope  is  to  be  finally 
able  to  obtain  a  more  detailed  local  description 
of  the  invariant  measure,  and  perhaps  construct 
more  rigorous  theoretical  arguments  about  the 
structure  of  the  invariant  measure. 

2.  Local  orthogonal  decomposition 

Analysing the possible states of a
sub-chain of length E requires constructing the
vectors

$$v_i = (x^i, x^{i+1}, \ldots, x^{i+E-1}), \qquad (2.1)$$

and then investigating their probability distribution
in the corresponding E-dimensional
space. This sort of spatial-embedding procedure
is equivalent to projecting the invariant measure
of the - in principle - infinite chain down onto an
E-dimensional space. Previous numerical simulations
revealed an increasingly slow convergence
of D(ε, E) to its limit value for increasing E [6].
These results have been interpreted as an indication
that the invariant measure covers an increasingly
thin subset along suitable directions in
the embedding space. Here, we further investigate
this point by performing a local orthogonal
decomposition. Let us first recall the standard
global implementation of OD [7]. Given a
vector v lying in a given embedding space, we start
by computing the correlation operator

$$K_{i,j} = \langle v^i v^j \rangle - \langle v^i \rangle \langle v^j \rangle, \qquad (2.2)$$

where ⟨ ⟩ indicates the average with respect to
the actual measure and $v^i$ is the i-th component of v.
The eigenvalues $\lambda_n$ of the
matrix $K_{i,j}$ can be interpreted as the average
square sizes of the distribution along the principal
axes, i.e. along the corresponding eigenvectors
$\psi_n$. In the case of a probability distribution
uniformly filling a thin parallelepiped, the relevant
information about the thickness of the
support is revealed by OD through the existence
of a small eigenvalue. After deforming the
parallelepiped into a still thin, but curved, manifold,
such information is immediately lost,
the true local thickness being masked by the
nonlinear character of the support. In such a
case, we can separately apply OD to different
smaller subsets so as to decrease nonlinear effects.

More precisely, we restrict the averages in eq.
(2.2) to the points falling inside the same box
B(r, ε), with r representing its (randomly
chosen) center, and ε its size. In order to construct
global indicators, the box-dependent
eigenvalues are then averaged over different reference
points. The computational cost of the simulations
prevented us from handling more than 200 such points.

It is not difficult to show that in the case of a
uniform distribution in a box of size ε, all the
eigenvalues are equal,

$$\lambda_n = \frac{\varepsilon^2}{12}. \qquad (2.3)$$

Therefore, it is convenient to introduce the rescaled
variables

$$w_n = \frac{12\,\langle \lambda_n \rangle}{\varepsilon^2}, \qquad (2.4)$$

which immediately provide information about
the deviation from a uniform distribution. Actually,
the dependence of $\lambda_n$ on n, more than
estimating the nonuniformity, characterizes local
anisotropy. However, the average over different
reference points present in eq. (2.4) allows one
to obtain true information about the nonuniformity.
Think for instance of a radially dependent,
spherically symmetric distribution. OD applied to
a box set around the center of the sphere does
not detect any nonuniformity, but any other
choice of the reference point reveals nonuniformity
as a local anisotropy.
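The local procedure just described can be sketched numerically as follows. This is only an illustration under stated assumptions: NumPy is used for the linear algebra, the function names are hypothetical, boxes are cubes centred on reference points drawn from the data, and the eigenvalues are normalized by the uniform-box value ε²/12 so that a locally uniform density gives values close to 1.

```python
import numpy as np


def local_od(points, center, eps):
    """Rescaled covariance eigenvalues of the points inside the box B(center, eps)."""
    points = np.asarray(points, dtype=float)
    inside = np.all(np.abs(points - center) <= 0.5 * eps, axis=1)
    sample = points[inside]
    if len(sample) <= points.shape[1]:
        return None  # too few points for a meaningful covariance estimate
    # Covariance restricted to the box, as in eq. (2.2).
    cov = np.atleast_2d(np.cov(sample, rowvar=False, bias=True))
    lam = np.linalg.eigvalsh(cov)      # eigenvalues in ascending order
    return 12.0 * lam / eps**2         # equals 1 for a uniformly filled box


def nonuniformity(points, eps, n_ref=200, seed=0):
    """Average the rescaled local spectra over randomly chosen reference points."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    refs = points[rng.choice(len(points), size=n_ref, replace=False)]
    spectra = [s for s in (local_od(points, r, eps) for r in refs) if s is not None]
    return np.mean(spectra, axis=0)
```

A thin slab of points gives one rescaled eigenvalue far below 1, which is the signature of small local thickness discussed in the text.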

Since any smooth probability distribution appears
more and more uniform when the resolution
is increased, all $w_n$ are expected to converge
to 1 for ε → 0. This cannot be the case for
the logistic map ($f(x) = a - x^2$), as a uniform
density around x = 0 is transformed into an
inverse-square-root singularity around the maximum.
In a chain of logistic maps, the same
phenomenon leads to the Cartesian product of
many such singularities. In order to get rid of
this problem (which is not the main cause of the
slow convergence of coarse-grained dimensions
[6]), we have numerically performed a change of
coordinate from x to y, transforming the single-site
probability density P(x) dx into a constant
distribution.
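One simple way to realize such a change of coordinate numerically is through the empirical cumulative distribution of single-site values; the sketch below is an assumption about how this could be done, not a description of the procedure actually used. For illustration the samples are drawn from the known invariant density of a single uncoupled map $f(x) = 2 - x^2$, namely $1/(\pi\sqrt{4 - x^2})$, rather than from a coupled chain.

```python
import bisect
import math
import random


def uniformizing_map(samples):
    """Return y(x), the empirical cumulative distribution of the samples.

    If x is distributed like the samples, y(x) is approximately uniform
    on [0, 1], so square-root singularities of the density are removed.
    """
    ordered = sorted(samples)
    n = len(ordered)

    def y(x):
        return bisect.bisect_right(ordered, x) / n

    return y


# Illustrative samples: x = 2 cos(pi u) with u uniform has the arcsine density.
rng = random.Random(0)
samples = [2.0 * math.cos(math.pi * rng.random()) for _ in range(20000)]
y_of = uniformizing_map(samples)
```

Applying `y_of` to every site of an embedded vector yields coordinates in which the single-site marginal is flat, while correlations between sites are left to be analysed by OD.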

As already mentioned in the introduction, we
have performed the numerical simulations for
a = 2 and the same coupling strength σ, a regime where
fully developed spatio-temporal chaos is present with an exponential
decay of spatial correlations. This is a case which
exhibits an increasingly slow convergence of the
coarse-grained fractal dimension [6]. It is worth
mentioning that this phenomenon does not appear
to be peculiar to these parameter values,
as the same behaviour has also been observed
in simulations performed with different
maps (e.g. tent map) [8].

In  fig.  1  we  have  reported  the  dependence  of 
Fig. 1. Dependence of the nonuniformity coefficients $w_n$
arising from the application of orthogonal decomposition to
the probability densities in boxes of different sizes ε. The
simulations have been performed for a chain of logistic maps,
$f(x) = 2 - x^2$, with diffusive coupling of strength σ. The dashed
horizontal line denotes the asymptotic value corresponding
to a uniform distribution.


$w_n$ on ε for E = 5, 7 and 9. The results have been
obtained  by  iterating  a  chain  of  length  L  =  1000 
and  processing  2  x  lO’  data  points.  Some  curves 
show a clear-cut convergence towards the expected
asymptotic value $w_n$ = 1. In the other cases
the  “large  scale”  convergence  is  followed  by  a 
“small scale” pseudo-divergence. The latter phenomenon
turns out to be the consequence of a
systematic error originating from the low statistics
in the smaller boxes. In fact, simulations performed
with fewer data showed the same phenomenon
already for bigger boxes.

Moreover, notice that the smallest "thickness"
$w_1$ decreases for increasing embedding dimension.
To better elucidate the dependence of $w_n$
on E, let us introduce the notion of nonuniformity
spectrum

$$W(n/E) = w_n. \qquad (2.5)$$

Analogously to the spectrum of Lyapunov exponents,
we expect that, in the thermodynamic
limit, all spectra should tend to superimpose
when plotted according to eq. (2.5). From fig. 2,
where we have reported the results obtained for
different chain lengths with global OD, we indeed
observe a convergence towards a limit
shape. Notice that the spectrum plotted in fig. 2
is [apart from the normalization of eq. (2.4),
which simply leads to a vertical shift in a logarithmic
scale, and from a possible reordering of
the eigenvalues] the spatial Fourier spectrum. In
fact, whenever translational invariance holds, the
principal eigenvectors coincide with Fourier
modes. However, this is no longer true when we
consider smaller boxes.

When the embedding dimension is increased,
it soon becomes unfeasible to generate enough
points to fill a small box in a realistic amount of
CPU time. Therefore, it is crucial to take into

Fig. 2. Nonuniformity spectra, as arising from global OD,
for increasing chain lengths.


388 


A. Politi, G.P. Puccioni / Invariant measure in coupled maps


account finite-size effects to better extrapolate
the asymptotic results already from short-length
simulations. It is reasonable to assume that finite-size
effects can be accounted for by a power-series
expansion in the smallness parameter 1/E.
Retaining only the first order [9], we have


where the coefficient a is determined so as to
yield the best superposition of the spectra in the
region of small argument (since we are interested in
extrapolating the minimum value W(0)). The results,
plotted in fig. 3 for two different resolutions,
clearly reveal a tendency to converge to a limit
shape.

Let us now try to interpret these results in the
spirit of the conjecture raised in ref. [6]. First of
all, W(0), for any finite choice of the box size ε,
is presumably bounded from below by nonlinearities
(unless nonlinear corrections tend to
vanish for E → ∞). Therefore, if the width of the
distribution locally decreases for E → ∞, W(0)
should decrease for ε → 0, staying however finite
as long as ε > 0. If, instead, the distribution were
uniformly filling an ellipsoid, then W(0) would
increase as $\varepsilon^{-2}$ when the resolution is initially
increased (see definition (2.3)).

Fig. 3. Nonuniformity spectra for two different resolutions:
ε = 0.42 (a), 0.18 (b) (ε is scaled in such a way that ε = 1
corresponds to the actual size of the set). The dashed lines
extrapolate the curves towards W(0).

From the results of figs. 2 and 3, we see that
W(0) neither decreases nor increases as fast as
$\varepsilon^{-2}$. This means that nonlinear effects are very
important in slowing the convergence of fractal
dimension estimates, but they do not seem to be
so strong as to justify the existence of an increasingly
thin manifold.

However, there is a simple argument in favour
of the latter conjecture. As already mentioned in
the Introduction, the evolution of a sub-chain of
length E can be seen as a deterministic evolution
inside an E-dimensional phase space, plus a sort
of random noise, due to the coupling with the
rest of the chain. As this coupling acts only at
the boundaries, its overall size is independent of
the chain length. In order to clarify the role
played by such a “noise”, it is useful to project it
along the position-dependent stable and unstable
directions. If the invariant manifolds were equally likely
to assume all possible directions, then the amplitude
of the “noise” along each direction
should, on the average, be of the order of $1/\sqrt{E}$.
On the one hand, a small noise along an unstable
direction is amplified but, as the attractor is
already continuous along such a direction, its
effect on the invariant measure is not qualitatively
relevant. On the other hand, a noise
acting along a stable direction is damped, but it
is sufficient to provide a finite width to the
probability density. However, as the “noise”
amplitude goes to 0 for E → ∞, the thickness
does the same. Actually, it is not true that the
manifold points equally likely along all directions,
as the Lyapunov vectors are more or less
localized [9], but this cannot affect the previous
argument. In fact, since the overall intensity of
the coupling is O(1), there can be at
most a finite number of stable directions characterized
by a noise amplitude independent of E.
Now, a finite number of directions escaping our
previous arguments does not make a relevant
difference in the thermodynamic limit. On the
contrary, it is in principle possible to imagine
invariant manifolds pointing along quasi-forbidden
directions and characterized by a noise amplitude
decreasing faster than $1/\sqrt{E}$.

Our numerical simulations, having shown a
slowly increasing W(0), are not in agreement
with the previous analysis. However, we must
recall that the above arguments are entirely
based on a linear analysis, while it is not
obvious whether nonlinear terms are negligible
at the relatively short embedding dimensions
used in the simulations. In fact, we have been
able to reach at most E = 13, which means an
average noise amplitude of about $2/\sqrt{13} \approx 0.55$
(there are two forcing terms acting at the extrema
of each sub-chain).

Before passing to a direct investigation of the
invariant manifold, let us comment on the
convergence of the fractal-dimension estimates.
From the knowledge of the density of points in
the various boxes, it is obviously possible to
measure coarse-grained dimensions. The average
value is plotted in fig. 4 versus E, for two
different resolutions. In both cases we observe
an almost linear increase with a slope of about 0.53
and 0.56, respectively. Both slopes are definitely
smaller than 1, the value that we would have
obtained by first taking the limit ε → 0. Furthermore,
the slope is close to the density of dimensions
(= 0.6), i.e. the value that we would have
expected for an isolated chain. A possible explanation
of this fact goes as follows. We have seen
that the effect of external noise is confined to
small length scales, the smallness depending on
the length E of the chain. Therefore, it is not
possible to distinguish the probability density
observed for E sufficiently large from that
of an isolated chain. Accordingly, D(ε, E) ≈ pE,
i.e. eq. (1.3) holds, together with the equality
$p_c$ = p. As a consequence, for ε sufficiently small,
one should be able to detect two distinct regimes
in the behaviour of D(ε, E): (i) a small-E region,
where the resolution is sufficient to reveal
the presence of the coupling, characterized by
the scaling behaviour D(ε, E) ≈ E; (ii) a large-E
region, where the asymptotic law is found. Some
evidence of this behaviour is shown in fig. 4,
where the curve obtained for ε = 0.18 (triangles)
exhibits a somewhat larger slope at small E-values
(see dashed curve). Clearer evidence
of a cross-over between two different regimes
could be reached for smaller boxes. Unfortunately,
this is practically unfeasible, because of
the enormous number of data points required to
fill such boxes.

Fig. 4. Coarse-grained dimension D versus the embedding
dimension E for two different resolutions: ε = 0.42 (squares),
0.18 (triangles). The error bars are drawn only when the
uncertainty is larger than the size of the symbols.
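A coarse-grained dimension at finite resolution can be estimated from how the number of points in a box changes with the box size. The sketch below is one possible estimator, written under several assumptions: NumPy is used, the function names are hypothetical, boxes are cubes (maximum norm) centred on supplied reference points, and the local slope is taken between box sizes ε and ε/2. It is not claimed to be the estimator used for fig. 4.

```python
import numpy as np


def embed(snapshots, E):
    """Collect all windows of E consecutive sites from an array of shape (T, L)."""
    snaps = np.asarray(snapshots, dtype=float)
    L = snaps.shape[1]
    windows = [snaps[:, i:i + E] for i in range(L - E + 1)]
    return np.concatenate(windows, axis=0)


def coarse_grained_dimension(points, refs, eps, ratio=2.0):
    """Average local slope of log(box mass) versus log(box size) at size eps."""
    points = np.asarray(points, dtype=float)
    slopes = []
    for r in np.asarray(refs, dtype=float):
        dist = np.max(np.abs(points - r), axis=1)   # max norm gives cubic boxes
        big = np.count_nonzero(dist <= 0.5 * eps)
        small = np.count_nonzero(dist <= 0.5 * eps / ratio)
        if small > 0:
            slopes.append(np.log(big / small) / np.log(ratio))
    return float(np.mean(slopes))
```

Repeating the estimate for increasing E at fixed ε and fitting a straight line to D versus E gives the slope that the text compares with the density of dimensions.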

Let us conclude this section by noting that eq.
(1.3) provides, in principle, a tool to distinguish
an infinite-dimensional chaotic signal (C) from a
random one (R). In fact, by interpreting the
spatial variable i as a time variable, the embedding
procedure sketched in eq. (2.1) becomes
the typical temporal-embedding technique used
to reconstruct strange attractors, and our signal
can be interpreted as an infinite-dimensional
chaotic signal. In both C- and R-cases,
the fractal dimension D(0, E) coincides
with the embedding dimension itself. However,
for a random signal, we expect that, instead of
eq. (1.3), the equality D(ε, E) = E again holds
at sufficiently high resolutions. This answers the
question formulated in ref. [10] and recalled at
the beginning of this paper.

3.  Orthogonal  decomposition  in  tangent  space 

In  the  previous  section  we  have  applied  local 
OD  to  the  invariant  measure  in  the  phase  space. 
We  have  seen  that,  on  the  one  hand,  this  ap¬ 
proach  is  able  to  confirm  the  slow  convergence 
of  fractal  dimension  estimates  but,  on  the  other 
hand,  it  does  not  lead  to  reliable  conclusions 
about  the  local  structure  of  the  probability  dis¬ 
tribution  for  increasing  embedding  dimensions. 

From  the  analysis  of  low-dimensional  chaos,  it 
is  well  known  that  Lyapunov  exponents  contain 
enough  information  so  as  to  provide  a  complete 
characterization  of  the  chaotic  properties  (e.g. 
fractal  dimension  and  metric  entropy).  In  the 
following  we  shall  combine  such  a  philosophy 
with  the  ideas  expounded  in  section  2,  to  de¬ 
scribe  spatially  extended  systems. 

Let us consider a chain of length L with a
Lyapunov dimension pL. The implementation of
the Kaplan-Yorke formula implicitly associates a
partial dimension to each invariant (either stable
or unstable) direction. More precisely, the directions
corresponding to the first j = [pL] (i.e. the
integer part of pL) Lyapunov vectors are characterized
by partial dimensions equal to 1 (such
directions include the whole unstable manifold
plus some of the less contracting directions). The
(j + 1)st direction is characterized by a fractal
structure, and the remaining stable directions are
to be associated with a zero partial dimension.

Here, we make the stronger assumption that
the invariant measure covers smoothly a regular
pL-dimensional manifold. In other words, we
neglect the single direction characterized by a
non-integer dimension: this approximation does
not appear to be so severe in the thermodynamic
limit. Moreover, we neglect multifractal corrections.
They are not small in principle, since the
fluctuations of pointwise dimension are proportional
to the chain length [11]. However, the
volume of the regions characterized by a dimension
value other than the average one is asymptotically
0. Finally, if a stable direction is characterized
by a zero partial dimension, it is not
necessarily true that the attractor has a point-like
structure along such a direction. We do not know
if such an anomaly is going to occur generically in
high dimensional systems. In the absence of rigorous
statements we use Occam's razor, making
the simplest assumption.

We now go back to the problem of projecting
locally the invariant measure, to exploit the assumption
about the smoothness of the probability
density. More precisely, we assume that the
distribution around a generic point
$X = (x_1, \ldots, x_L)$ covers uniformly a pL-dimensional
subspace identified by the corresponding
Lyapunov vectors $u_i$. The projection P is simply
performed by retaining E spatially consecutive
coordinates of the point X and of the Lyapunov
vectors. As a result, we determine the directions
spanned by the probability distributions in the
embedding space. Obviously, since pL > E, the
resulting vectors overspan the E-dimensional
space. However, nothing is known a priori about
the directions exhibited by the projected vectors
$Pu_i$. To investigate this point, we have interpreted
such vectors as points in an E-dimensional
space and applied OD to them. The results
(averaged over different reference points) are
shown in fig. 5, where the smallest coefficient $w_1$
is plotted versus E for two different chain
lengths. The clean straight-line behaviour indicates
an exponential decrease of $w_1$. This means
that the projection of a pL-dimensional hypercube
onto an E-dimensional space is mostly
concentrated along suitable directions, while
being very thin along some other ones. However,
this result establishes only a necessary, but far
from sufficient, condition to explain the existence
of directions characterized by a small thickness.
In fact, in the above analysis, we have
implicitly assumed that the probability density
around a generic point in the embedding
space derives from the projection of the invariant
measure around just one point $S_L$ in the
L-dimensional space ($S_E = PS_L$). This is not
true, for there are infinitely many different $S_L$
points in the global phase space, whose projection
yields the same point in the embedding
space. Therefore, one should superimpose all
such distributions before applying OD to the
Lyapunov vectors. The computation of the resulting
nonuniformity spectra is a much more
cumbersome task, which we plan to undertake in
the near future.

Fig. 5. Smallest eigenvalue $w_1$ (from the projection of
Lyapunov vectors) versus the dimension E of the projection
space for different chain lengths L.
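The projection step of this section can be illustrated with the rough sketch below. Several choices here are assumptions and not taken from the text: NumPy is used, the Lyapunov vectors are approximated by the orthonormal Gram-Schmidt vectors produced by repeated QR factorization, the map is $f(x) = 2 - x^2$ with a hypothetical coupling σ passed by the caller, and the names are illustrative. The sign of each vector is arbitrary, so the mean subtracted in the OD step carries the same ambiguity.

```python
import numpy as np


def coupling_matrix(L, sigma):
    """Diffusive coupling of eq. (1.1) on a periodic chain of L sites."""
    C = (1.0 - sigma) * np.eye(L)
    for i in range(L):
        C[i, (i + 1) % L] += 0.5 * sigma
        C[i, (i - 1) % L] += 0.5 * sigma
    return C


def lyapunov_basis(L, sigma, steps=2000, transient=500, seed=0):
    """Evolve the chain and an orthonormal tangent frame by QR iteration."""
    rng = np.random.default_rng(seed)
    C = coupling_matrix(L, sigma)
    x = rng.uniform(-2.0, 2.0, L)
    Q = np.linalg.qr(rng.standard_normal((L, L)))[0]
    log_r = np.zeros(L)
    for t in range(transient + steps):
        y = C @ x
        J = (-2.0 * y)[:, None] * C        # Jacobian of x -> 2 - (C x)**2
        x = 2.0 - y * y
        Q, R = np.linalg.qr(J @ Q)
        if t >= transient:
            log_r += np.log(np.abs(np.diag(R)))
    return x, Q, log_r / steps             # state, frame, Lyapunov exponents


def projected_od(Q, n_vectors, start, E):
    """OD of the first n_vectors frame vectors restricted to E consecutive sites."""
    P = Q[start:start + E, :n_vectors]     # each column is one projected vector
    centred = P - P.mean(axis=1, keepdims=True)
    K = centred @ centred.T / n_vectors    # analogue of eq. (2.2)
    return np.linalg.eigvalsh(K)           # ascending; first entry is the smallest
```

Calling `projected_od` with `n_vectors` close to pL for increasing E, and averaging over different starting sites and times, gives a crude analogue of the quantity plotted in fig. 5.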
PDE simulations for the Kolmogorov flow are analyzed in terms of phase-space concepts. The tool used is the proper
orthogonal decomposition method which extracts coherent structures and prominent features of a random or turbulent
dataset. We analyze a quasiperiodic regime and an intermittent regime. We derive two eigenfunctions that determine the
dynamics and structure of the quasiperiodic case and find a third one associated with the unstable manifold of the bursts of
the  intermittent  regime.  Calculations  are  performed  for  streamfunction  data  and  vorticity  data  which  show  substantial 
differences.  It  is  argued  that  the  streamfunction  data  demonstrate  the  low  dimensional  phase-space  dynamics  of  the  large 
scales  whereas  the  vorticity  data  show  an  enstrophy  cascade. 


1.  Introduction 

Recent  large  scale  simulations  of  periodically 
forced  2D  Navier-Stokes  equations  (Kol¬ 
mogorov  flow)  have  revealed  a  quasiturbulent 
regime  characterized  by  a  periodically  modulated 
drifting  laminar  flow  interspersed  by  unpredict¬ 
able,  sometimes  violent  eruptions  of  turbulence 
[1].  In  the  spirit  of  recent  approaches  to  connect 
spatially  and  temporally  complex  behaviour  to 
finite  dimensional  attractors,  one  hopes  to  ex¬ 
plain  those  simulations  through  structurally  sta¬ 
ble  features  of  dynamical  systems.  An  important 
ingredient  for  such  an  idea  is  the  symmetry 
structure  of  these  simulations.  The  latter  are 
performed  with  periodic  boundary  conditions 
which, together with the periodic forcing (wavelength
$2\pi/k$), introduce a $D_k \times O(2)$ symmetry
under which the governing equations are
equivariant.  Because  of  these  symmetry  con¬ 
straints,  our  working  hypothesis  for  the  turbulent 
bursts  is  that  they  are  governed  by  structurally 
stable  homoclinic  connections  between  the 


quasiperiodic  laminar  solutions,  and  phase- 
shifted  laminar  solutions  which  are  related  to 
each  other  via  group  operations  [2].  Unfortu¬ 
nately,  while  similar  behavior  appears  in  the 
Kuramoto-Sivashinsky  equation,  where  such  a 
regime  occurs  essentially  as  a  secondary  bifurca¬ 
tion  which  can  be  explained  by  a  two-mode 
analysis  [3],  our  intermittently  turbulent  regime 
occurs  after  many  previous  bifurcations  have 
happened  and  therefore  the  basic  flow  is  un¬ 
stable  to  many  modes.  Hence  a  group  theoretic 
analysis  would  involve  the  interaction  of  many 
different  representations  of  the  symmetry  group 
making  a  full  analysis  unfeasible. 

A  suitable  tool  to  extract  phase-space  informa¬ 
tion  out  of  large  scale  PDE  simulations  seems  to 
be  the  proper  orthogonal  decomposition  (POD) 
also  known  as  Karhunen-Loeve  decomposition 
[4,  5].  This  was  recently  demonstrated  for  the 
Kuramoto-Sivashinsky  equation  in  refs.  [6,  7]. 
Simulations for Kolmogorov flow in refs. [1, 8, 9]
clearly  give  the  impression  of  a  finite,  in  fact  very 
low  dimensional  dynamics  for  this  regime.  Con- 


0167-2789/92/$05.00  ©  1992  -  Elsevier  Science  Publishers  B.V.  All  rights  reserved 


D. Armbruster et al. / Phase-space analysis of bursting behavior in Kolmogorov flow


393 


sequently  our  long  term  goal  is  to  come  up  with 
a  reasonably  low  dimensional  dynamical  system 
derived  via  a  proper  orthogonal  decomposition 
and  amenable  to  traditional  phase-space  analy¬ 
sis.  While  we  have  not  yet  synthesized  such  a 
system  we  have  analyzed  the  simulations.  We 
think  our  preliminary  results  give  very  interest¬ 
ing  partial  answers  and  raise  some  new  and 
crucial  questions  about  the  nature  of  turbulence. 

Using  POD  we  extract  the  dominant  features 
of  the  data  as  spatial  eigenfunctions  of  the 
covariance  matrix  and  project  the  data  onto  the 
first  few  eigenfunctions.  Doing  this  for  a  time 
sequence  we  can  reconstruct  the  time  evolution 
of  the  flow  using  a  suitable  number  of  these 
eigenmodes. Specifically, we reconstruct
two regions of the simulations: the region believed
to be predominantly laminar and the region
containing a burst. While the
PDE numerical code integrates a streamfunction,
turbulence theory prefers to deal with vorticity.
Performing a POD on both, we get the
surprising result that the dimensions derived
from an energy criterion on the POD and visual
comparison of reconstructed and original data,
as well as those derived from an embedding
algorithm [10], do not coincide for streamfunction
and vorticity data. We offer interpretations
and speculations on the reasons for that discrepancy
in our conclusion.

Another  interesting  result  concerns  the  impli¬ 
cation  of  the  streamfunction  analysis  for  Kol¬ 
mogorov  flow:  We  confirm  that  the  laminar  flow 
can  be  described  as  a  modulated  travelling  wave. 
Its  basic  POD  eigenfunctions  are  calculated  and 
their  Fourier  components  are  known.  We  can 
also  quantify  the  dimension  of  the  burst  regime. 
In  particular  we  can  identify  modes  that  are 
associated  with  the  unstable  manifold  of  the 
modulated  travelling  wave. 

The  remainder  of  the  paper  is  organized  as 
follows:  In  section  2  we  describe  the  relevant 
features  of  the  bursting  regime  in  Kolmogorov 
flow.  Section  3  reviews  the  proper  orthogonal 
decomposition  insofar  as  it  applies  to  phase- 


space  analysis  of  partial  differential  equations. 
Section  4  reports  on  our  results  for  streamfunc¬ 
tion  data  and  section  5  deals  with  vorticity  data. 
The  conclusion  discusses  these  results  in  terms  of 
low  dimensional  phase-space  dynamics  versus 
enstrophy  cascade. 

2.  The  Kolmogorov  flow:  Bursting  regimes 

The two dimensional Kolmogorov flow is the
solution of the 2D Navier-Stokes equation with
a uni-directional force $f = (\nu k_f^3 \cos k_f y, 0)$. It
was introduced by Kolmogorov in the late 1950s
as an example on which to study transition to
turbulence. For large enough viscosity $\nu$, the
only stable flow is the plane parallel periodic
shear flow $u = (k_f \cos k_f y, 0)$, usually called the
“basic Kolmogorov flow”. The macroscopic
Reynolds number of the basic flow is easily
found to be $1/\nu$; this will be used later as a free
parameter to define the bifurcation sequence. It
was shown by Meshalkin and Sinai [11] that
large-scale instabilities are present for Reynolds
numbers exceeding a critical value, $\sqrt{2}$. This
large-scale instability has been shown by Nepomnyachtchyi
[12] and Sivashinsky [13] to be of
negative-viscosity type, in the sense that the
basic anisotropic flow generates a negative viscosity
for large scale perturbations. In a $2\pi$-periodic
square box the equations are

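$$\frac{\partial u}{\partial t} + u \cdot \nabla u + \nabla p = \nu \nabla^2 u + f, \qquad (1)$$

$$\nabla \cdot u = 0, \qquad (2)$$

$$f = (\nu k_f^3 \cos k_f y, 0), \quad 0 \le x, y \le 2\pi. \qquad (3)$$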
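
As a quick consistency check (a worked sketch, not part of the original text), one can verify that the basic Kolmogorov flow $u = (k_f \cos k_f y, 0)$ solves eqs. (1)-(3) with constant pressure, and read off the Reynolds number. The characteristic velocity $U = k_f$ and length $L = 1/k_f$ are assumptions, chosen because they reproduce the stated result Re = $1/\nu$:

```latex
\begin{align*}
\nabla \cdot u &= \partial_x (k_f \cos k_f y) = 0, \\
u \cdot \nabla u &= u_x \, \partial_x u = 0
  \quad \text{(the flow does not depend on } x\text{)}, \\
\nu \nabla^2 u &= (-\nu k_f^3 \cos k_f y,\ 0) = -f, \\
\Rightarrow\quad
\partial_t u + u \cdot \nabla u + \nabla p - \nu \nabla^2 u - f &= 0
  \quad \text{for } p = \text{const}, \\
\mathrm{Re} = \frac{U L}{\nu} &= \frac{k_f \cdot (1/k_f)}{\nu} = \frac{1}{\nu}.
\end{align*}
```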

Sequences of bifurcations have been investigated
in refs. [8, 14] (see also ref. [15]). Recently,
interesting transitions that occur at higher Reynolds
number have been studied in refs. [1, 16, 9].
They lead to sparsely distributed bursts in time
for a fairly large range of Reynolds number from
a threshold of about Re = 20.5 for $k_f = 8$ to about
Re = 120. The most striking feature of this transition
is that the bursts generate substantial spatial
disorder and drive developed turbulence.
Typically,  near  threshold  the  dynamics  follows  a 
long  laminar  regime,  then  undergoes  a  strong 
chaotic  burst,  then  seems  to  settle  down  to  a 
laminar  regime  at  the  same  level  as  before;  then 
another  explosion  follows.  Intervals  between 
bursts  are  not  constant  and  fluctuate  randomly. 
A  study  of  the  Fourier  modes  [1, 16,  9]  suggests 
that 

-  the  laminar  regime  can  be  described  by  a 
traveling  modulated  system  of  eddies  with  well 
defined  symmetries  (symmetry  of  a  square  lattice 
rotated  by  45°); 

-  successive  laminar  intervals  do  not  correspond 
to  identical  dynamical  states  but  rather  to  a 
sequence  of  states  mapped  onto  each  other 
under  some  group  action.  These  group  actions  in 
particular  act  on  the  phases  of  the  complex 
Fourier  coefficients  as  demonstrated  in  refs.  [1, 
16,  9]; 

- spatial order is destroyed during the bursts,
with  spatial  dynamics  on  a  much  smaller  space- 
scale. 

In  refs.  [1,  16]  this  kind  of  behavior  is  dis¬ 
cussed  in  terms  of  heteroclinic  and  homoclinic 
loops  in  phase  space  involving  an  analysis  of  the 
symmetry  groups  of  the  system.  These  symmet¬ 
ries  enable  the  heteroclinic  loops  to  become 
structurally  stable  which  is  in  accord  with  the 
observation  that  the  bursting  persists  up  to  and 
beyond Re = 120 (for $k_f = 8$). As the Reynolds
number  increases,  the  time  between  bursts  be¬ 
comes  shorter,  so  that  the  coherent  vortices  ulti¬ 
mately  appear  only  during  brief  intermittencies 
and  the  bursts  dominate  nearly  all  of  the 
dynamics.  It  should  be  pointed  out  that  the 
Reynolds number Re = $1/\nu$ is only a macroscopic
number.  Effective  (local)  Reynolds  numbers 
vary  from  500  (i'  =  5)  to  1500  (»'  =  lis)- 

In this work we focus on a regime just beyond,
yet very close to, the bifurcation from the modulated
travelling wave; specifically Re = 20.545,
$k_f = 8$ (the laminar states are still stable at Re =
20.540). One of our goals is to obtain more
information on the unstable manifold of the
hyperbolic tori in a situation where the dynamics
is still weakly chaotic and remains for a very long
time in a neighborhood of the tori. Fig. 1 shows
a typical plot of the vorticity: There is a large-scale
structure underlying two prominent localized
eddies in the upper left and the lower right
corner and a diagonal wave structure. Plots of the
maximal vorticity against time in typical time
series at Re = 20.545 reveal long laminar sequences
with “microbursts” spaced far apart.
There is no phase shift before and after such
microbursts and the dynamics does not (yet)
track a full heteroclinic connection. Hence by
doing a POD analysis on these microbursts we
hope to be able to reveal the trigger mechanism
for the chaotic bursting.

Fig. 1. Typical data vector (snapshot) showing the vorticity
(z-direction) above an (x, y) grid. Note the two circular
eddies in the upper left and the lower right corner and the
remnants of the period 8 basic Kolmogorov flow.


3.  The  Proper  Orthogonal  Decomposition 

This  procedure  has  been  proposed  by  Lumley 
[4]  and  subsequently  used  by  many  groups  to 
extract  coherent  structures  from  turbulent  data. 
We briefly summarize the method and refer the
reader to ref. [5] for details. Basically one looks
for eigenfunctions that optimally capture the
dynamics of a certain flow. The eigenfunctions
have the interpretation of maximizing an averaged
quadratic functional which, for data representing
a velocity, corresponds to the kinetic
energy. Therefore we generally refer to the
‘energy’ of the data. Equivalently,
the eigenfunctions maximize the correlation of
the flow fields in each coordinate direction.
Specifically, if $u(x, t)$ represents the spatio-temporal
evolution of the PDE in question, then
we choose $\psi_1$ such that

$$\lambda_1 = \lim_{T \to \infty} \frac{1}{T} \int_0^T (\psi_1, u)^2 \, dt \qquad (4)$$

is a maximum with the side constraint that
$(\psi_1, \psi_1) = 1$. Proceeding inductively, the resulting
variational problem leads to a Fredholm type
integral equation

$$\int K(x, y) \, \psi(y) \, dy = \lambda \psi(x), \qquad (5)$$

where the kernel $K(x, y) = \langle u(x, t) \, u(y, t) \rangle$ and
the brackets denote time average. Consider a
truncated expansion of the flow in terms of the
$\psi_n$, namely

$$u(x, t) \approx \sum_{n=1}^{N} a_n(t) \, \psi_n(x). \qquad (6)$$

Then we observe that the modes are uncorrelated
in time as $\langle a_j(t) \, a_k(t) \rangle = \delta_{jk} \lambda_j$ and furthermore,
the eigenvalue $\lambda_j$ corresponds to the statistical
variance in the jth coordinate direction, and
is maximal.

We  face  two  problems  with  that  type  of  data 
analysis:  As  is  obvious  from  eq.  (4)  one  usually 
performs  time  averages  over  very  long  samples. 
However,  this  tends  to  obscure  short  lived  events 
like  the  intermittent  bursts,  and  it  smooths  out 
all  temporal  details  of  the  flow.  Therefore  we 
carefully  select  the  data  for  our  two  analyses 
(Fig.  2):  We  cut  the  data  containing  a  microburst 
into  two  pieces.  The  first  one  is  laminar  (by 
visual  inspection);  the  second  one  contains  a 
large  part  of  laminar  flow  but  also  includes  a 
burst. Keeping a laminar part in our burst timeframe
ensures that we are processing those data
that come from small exponential growth along
the unstable manifold of the torus.

The second problem concerns the traveling of
our data field. Theoretical analysis shows [5] that
the POD eigenfunctions for a travelling wave
become sinusoids. We are, however, not interested
in the Fourier decomposition of the traveling
wave but want to treat it as one coherent
structure. Hence we process our data first, calculate
the wave speed and go into a co-moving
coordinate frame. The mean of the so-called
“untraveled” data then corresponds to the
traveling structure. We subtract the mean and
perform our POD on the remaining data, both
for the laminar and the bursting regime.

Fig. 2. Time series of the maximal vorticity over the sample. Our laminar phase comprises data in the first 30% of the sample, our
burst phase is centered around t = 5000.
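
The "untraveling" step can be illustrated with a minimal sketch. It assumes a one-dimensional, $2\pi$-periodic field sampled on a uniform grid (the paper's data are two-dimensional), and it estimates the wave speed from the phase drift of a single Fourier mode. The function name `untravel` and its parameters are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def untravel(snapshots, times, mode=1):
    """Estimate the wave speed and shift data into the co-moving frame.

    snapshots: array (n_frames, n_x), samples of u(x, t) on a uniform
               2*pi-periodic grid; times: array (n_frames,).
    Returns (speed, frozen), where frozen[i] approximates u(x + c*t_i, t_i).
    """
    spectra = np.fft.fft(snapshots, axis=1)
    # For u(x, t) = f(x - c t) the phase of Fourier mode k drifts as -k*c*t.
    phase = np.unwrap(np.angle(spectra[:, mode]))
    slope = np.polyfit(times, phase, 1)[0]
    speed = -slope / mode
    n_x = snapshots.shape[1]
    k = np.fft.fftfreq(n_x, d=1.0 / n_x)           # integer wavenumbers
    # Translating by +c*t multiplies mode k by exp(+i k c t).
    shift = np.exp(1j * np.outer(speed * times, k))
    frozen = np.fft.ifft(spectra * shift, axis=1).real
    return speed, frozen
```

The time mean of `frozen` then plays the role of the traveling structure, and the fluctuations about that mean are what would be passed to the POD. The phase unwrapping assumes the sampling interval is short enough that the phase changes by less than $\pi$ between frames.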

The technical details are the following: Our
simulation data come from a 64 × 64 pseudo-spectral
algorithm for the streamfunction. Unless
otherwise noted we use a shell of Fourier modes
incorporating all modes of the form (m, n) with

(shell  10).  The 
POD is performed as a snapshot method [5] with
200 time frames on either the physical data (a
64 × 64 vector calculated by inverse FFT from
the shell 10 Fourier modes) or on the Fourier
coefficients directly, one snapshot vector now
corresponding to the approximately 200 Fourier
amplitudes in our shell 10.
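
A minimal sketch of a snapshot POD is given below, assuming each snapshot has been flattened into a row vector and that an unweighted Euclidean inner product is adequate. The function name `snapshot_pod` and the relative threshold used to discard numerically null modes are illustrative assumptions, not details from the paper.

```python
import numpy as np

def snapshot_pod(snapshots):
    """Snapshot POD of data with shape (n_frames, n_points).

    Returns (mean, eigenvalues, modes): eigenvalues in descending order and
    modes as orthonormal rows, one spatial eigenfunction per row.
    """
    mean = snapshots.mean(axis=0)
    fluct = snapshots - mean
    n_frames = fluct.shape[0]
    # Small (n_frames x n_frames) temporal correlation matrix.
    corr = fluct @ fluct.T / n_frames
    evals, evecs = np.linalg.eigh(corr)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Drop numerically null directions (threshold is an arbitrary choice).
    keep = evals > 1e-12 * evals[0]
    evals, evecs = evals[keep], evecs[:, keep]
    # Lift temporal eigenvectors to unit-norm spatial modes.
    modes = (fluct.T @ evecs) / np.sqrt(n_frames * evals)
    return mean, evals, modes.T

def reconstruct(snapshots, mean, modes, n_modes):
    """Project the data on the first n_modes modes and add back the mean."""
    coeffs = (snapshots - mean) @ modes[:n_modes].T
    return mean + coeffs @ modes[:n_modes]
```

With 200 frames the eigenproblem is only 200 × 200 regardless of the spatial resolution, which is the practical appeal of the snapshot formulation. The time-averaged products of the modal coefficients `(snapshots - mean) @ modes.T` come out diagonal, with the eigenvalues on the diagonal.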


4.  Analyzing  the  streamfunction  data 

We perform the described POD analysis on
the laminar regime using the data in the streamfunction
formulation. Ordering the eigenfunctions
according to the magnitude of the eigenvalue
(“energy”) we have 98.7% of the energy
in the first two modes which are displayed in Fig.
3. A reconstruction of the data using just those
two modes together with the mean flow gives an
almost perfect agreement between data and reconstruction.
Using the POD on the Fourier
amplitudes we capture almost the same amount
of energy in the first few eigenfunctions. We
confirmed that the linear operations of Fourier
transformation and POD commute, i.e., the
Fourier transformation of the first two eigenfunctions
in physical space gives us the first two
eigenfunctions of the proper orthogonal decomposition
performed directly on the Fourier amplitudes.
Fig. 4 shows the two eigenfunctions in
Fourier space. One sees that although the modes
(0,1) and (1,0) are very dominant, there are
higher modes (smaller scales) that also play a
role. Hence a simple description of this limit
cycle in terms of Fourier modes does not seem to
be feasible.

Fig. 3. First two eigenfunctions for the POD of the streamfunction
data in the laminar phase, (a) with 89.9% of the
energy, (b) with 9.4% of the energy.

Analyzing  the  burst  for  the  streamfunction 
data  shows  some  surprising  results.  We  can  now 
capture  98.0%  of  the  energy  in  three  modes  (fig. 
5). 
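
The energy criterion used here amounts to counting how many leading eigenvalues are needed to reach a given fraction of their sum. A minimal sketch follows; the helper name `modes_needed` is hypothetical. In the usage note, the first three values are the burst-phase energies quoted in Fig. 5, while the last two are made-up placeholders for the unreported remainder.

```python
import numpy as np

def modes_needed(eigenvalues, fraction):
    """Smallest number of leading POD modes whose eigenvalues sum to at
    least `fraction` of the total energy."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(lam) / lam.sum()
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, len(lam))
```

For example, `modes_needed([80.9, 12.1, 5.5, 1.0, 0.5], 0.98)` counts three modes, in line with the three-mode truncation described above. Such a count says nothing about whether localized, low-energy structures are captured.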

The first mode in the laminar regime is roughly
the same as the first mode in the bursting
regime, whereas modes two and three in the
bursting regime both seem to be related to mode
two in the laminar regime. It appears that they
are real and imaginary parts of the same complex
eigenfunction. This suggests that if the limit cycle
lies in a real subspace, then its unstable manifold
is in the imaginary subspace. This is corroborated
by the observation that the POD eigenfunctions
two and three in the Fourier representation
(fig. 6) contain basically the same
modes and hence span a complex Fourier mode.
If these subspaces can be shown to be invariant
under some subgroup of the full symmetry
group, then this observation might give us a clue
to fully analyze the symmetry operations that are
involved with the heteroclinic orbits. However,
we have trouble reconstructing the data. Comparing
reconstruction and data we see that the
large scale flow is resolved well but the characteristic
burst dynamics in the center of the two
eddies is not captured by the three mode reconstruction.
It turns out that a number of modes
(e.g. 7, 8, and 11 (see fig. 7), with energies
around 0.1%) are very localized on these eddies.
Including them in a reconstruction substantially
improves it.

Fig. 4. Two POD eigenfunctions in Fourier space that span the limit cycle of the laminar flow in streamfunction formulation. The
labelling on the x-axis is done as follows: ($k_x$ = 0...9, $k_y$ = 0), ($k_x$ = 0...9, $k_y$ = 1) and so on to $k_y$ = 9 and then repeat for
$k_y$ = -1...-9. A few labels ($k_x$, $k_y$) for the real and imaginary parts of the Fourier modes are indicated.

Fig. 5. (a)-(c) First three eigenfunctions for the POD of the
streamfunction in the burst phase. Compare to the principal
eigenfunctions of fig. 3. Energies are 80.9%, 12.1% and
5.5%.

Fig. 6. The first three POD eigenfunctions for the burst phase of the streamfunction data in Fourier space. Compare to fig. 5.
Note also that eigenfunctions two and three seem to span a complex eigenfunction.


Fig. 7. Eigenfunction eight for the streamfunction data in the
burst phase. Energy is only ≈0.1%. Note the highly localized
structure in the lower right eddy.

5. Analyzing the vorticity data

Let us consider the flow simulations in a vorticity
representation $\omega = -\nabla^2 \Psi$, where $\Psi(x, y)$ is
the streamfunction. Visual inspection of the data
in this representation indicates sharper gradients
in space and time (i.e., changes in the picture
happen more abruptly and with larger amplitude)
but a very similar overall dynamics. We
note that the POD eigenvalues now physically
represent enstrophy, i.e., $\int |\omega|^2 \, dx \, dy$. It turns
out  that,  although  the  visual  impression  of  the 
dynamics  is  not  very  different  from  the  dynamics 
of  the  streamfunction  data,  we  now  need  five 
(eight)  POD  eigenfunctions  to  capture  96% 
(99%)  of  the  energy  of  the  laminar  flow.  Worse 
yet,  we  need  20  POD  eigenfunctions  for  the 
burst  to  capture  95%  of  the  energy.  Also,  typi¬ 
cally  these  eigenfunctions  have  a  very  compli¬ 
cated  Fourier  spectrum,  showing  substantial  con¬ 
tributions  from  the  smaller  scales  (fig.  9). 
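
The stronger small-scale content of the vorticity data follows from the relation between the two fields in Fourier space, where the Laplacian multiplies each mode by $k_x^2 + k_y^2$. The sketch below assumes a square $2\pi$-periodic grid and the sign convention $\omega = -\nabla^2 \Psi$. The function name is hypothetical and this is not the authors' pseudo-spectral code.

```python
import numpy as np

def vorticity_from_streamfunction(psi):
    """Compute omega = -laplacian(psi) spectrally on an (n, n) periodic grid
    covering [0, 2*pi) in both directions."""
    n = psi.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)               # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    psi_hat = np.fft.fft2(psi)
    # -laplacian corresponds to multiplication by +(kx^2 + ky^2).
    omega_hat = (kx**2 + ky**2) * psi_hat
    return np.fft.ifft2(omega_hat).real
```

A streamfunction component at wavenumber 8 with amplitude 0.1 therefore appears in the vorticity with amplitude 6.4, while a wavenumber-1 component of amplitude 1 keeps amplitude 1. Modes that carry little streamfunction energy can thus carry a large share of the enstrophy.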

It seems, however, that these Fourier spectra
lead to constructive interference such that the
corresponding transformed functions in physical
space lead to relatively smooth structures (fig.
8). Inspecting the POD eigenfunctions in the
laminar and bursting regime we find that the first
two eigenfunctions are roughly the same whereas
the third eigenfunction (figs. 8, 9) for the burst is
a new one which cannot be found among the first
five of the laminar regime. This indicates, as with
the streamfunction data, that this mode could be
responsible for the bursting dynamics. Reconstruction
and visual comparison show that about
eight eigenfunctions in the laminar case and 15 in
the bursting case capture the small scale dynamics in
the eyes of the eddies reasonably well whereas a
smaller number of modes (around five and ten)
seem to capture the large scale dynamics. Due to
the more pronounced nature of the small scale
activity in the vorticity data it is more difficult to
separate small and large scales visually.

Fig. 8. Third POD eigenfunction of the vorticity data in the
burst phase. This eigenfunction captures an energy of about
8.7% and seems to play no role in the dynamics of the
laminar phase.

Fig. 9. Third POD eigenfunction of the vorticity data in the
burst phase in Fourier space. See fig. 8.

6.  Conclusion 

Due  to  the  ongoing  nature  of  this  research 
only  few  definite  conclusions  can  be  drawn  but 
several  interesting  comments  and  speculations 
are  in  order. 

- Clearly the laminar phase is a modulated
traveling  wave.  Its  wavespeed  can  be  calculated 
by  observing  the  change  in  phase  of  a  Fourier 
mode.  In  a  coordinate  frame  moving  with  the 
traveling  part,  the  oscillation  is  described  by  a 
limit  cycle  which  is  spanned  by  the  first  two 
eigenfunctions  of  the  POD  decomposition  for 
the streamfunction data. We independently confirmed
this by extracting a scalar “averaged”
quantity from the field (summing up a norm over
a 16 × 16 grid in physical space). On the time
series for that scalar quantity we performed a
time-delay embedding and calculated the fractal
dimension $d_f$ [10]. For both streamfunction and
vorticity data we found $d_f \approx 1.0$.

- There exists a low dimensional large scale
dynamics that drives the burst. The unstable
manifold of the laminar phase, apart from containing
the $T^2$-torus, is very flat, concentrating the
burst mainly along one dimension. This is again
supported by plots of a time-delay embedding.
Fig. 10 shows such a two-dimensional time-delay
plot. One sees the remnants of the basic limit
cycle with faster oscillations superimposed. Computations
of fractal dimensions suffer from too
small a data set and are not very accurate. Initial
calculations give $d_f = 1.24$ and $d_f = 1.65$ for
streamfunction and vorticity data, respectively.
Both numbers are consistent with a POD embedding
dimension between three and five.

Fig. 10. Two-dimensional time-delay plot for an averaged
scalar function of the vorticity field. One can distinguish a
laminar dynamics following closely the original limit cycle
and a faster, oscillatory phase with motion transverse to the
limit cycle.

- The biggest mystery of this analysis is the
large discrepancy between the number of relevant
eigenfunctions for streamfunction and vorticity
data. From a dynamical systems point of
view, if this system has a finite dimensional
attractor then there exists only one dimension
and all numbers for vorticity and streamfunction
should agree with each other. However, the
Laplacian obviously acts as a nonlinear weight
function giving largest weight to the smallest
scales of the PDE simulation. Therefore the
general noise level of the vorticity data is increased
up to a margin of about 5% of the POD
energy. In order to check the dependence of the
POD results on the resolution of the PDE simulation
we enlarged our data set and used all
Fourier modes available from the 64 × 64
pseudo-spectral grid. The result is that we
needed  even  more  POD  eigenfunctions  (25  ver¬ 
sus  20)  to  capture  95%  of  the  energy  in  the 
vorticity  burst  data.  At  the  same  time  visual 
inspection  of  the  reconstruction  was  satisfactory 
with  fewer  POD  modes  (10  modes  versus  16). 
These  results  suggest  that  there  are  two  types  of 
dynamics  going  on  during  the  burst  phase:  A 
large  scale,  low  dimensional  one  which  can  be 
described  by  a  structurally  stable  homoclinic 
orbit  and  which  has  very  definite  symmetry  prop¬ 
erties.  This  dynamics  is  best  captured  by  looking 
at the behavior of the streamfunction. Riding on
top of that dynamics is a small-scale dynamics
characterized by an enstrophy cascade: Enstrophy
is accumulating in the modes which are
barely resolved and have the smallest scales.
Better  spatial  resolution  leads  to  an  accumula¬ 
tion  of  enstrophy  in  the  new,  yet  smaller  scales. 
Such  an  enstrophy  cascade  during  a  burst  was 
already  pinpointed  in  ref.  [1].  While  some  of 
these  scales  which  do  not  belong  to  the  large- 
scale  dynamics  have  some  distinct  visual  impact 
on  the  flow  -  they  drive  the  burst  in  the  eyes  of 
the  eddies  -  we  believe  the  modes  in  the  en¬ 
strophy  cascade  to  be  basically  driven  by  the 
large-scale  dynamics  and  noise.  In  particular, 
they  possibly  play  a  role  in  triggering  the  burst 
event  and  hence  in  the  randomness  of  the  time 
between  bursts,  but  they  do  not  contribute  to  the 
overall  shape  of  the  attractor.  This  view  is  con¬ 
sistent  with  recent  two  fluids  turbulence  theories 
and  with  a  similar  study  of  intermittent  behavior 
in  boundary  layers  triggered  by  midstream  pres¬ 
sure  fluctuations  [17]. 
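
The time-delay embedding and fractal-dimension estimates quoted in the conclusion can be illustrated with a minimal sketch. It assumes a Grassberger-Procaccia style correlation-sum estimate, which may differ from the embedding algorithm of ref. [10]. The function names, the choice of lag and the range of radii are all illustrative assumptions.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Return delay vectors (s[i], s[i+lag], ..., s[i+(dim-1)*lag]) as rows."""
    series = np.asarray(series, dtype=float)
    n = len(series) - (dim - 1) * lag
    return np.column_stack([series[i * lag : i * lag + n] for i in range(dim)])

def correlation_dimension(points, radii):
    """Slope of log C(r) against log r, where C(r) is the fraction of point
    pairs closer than r (a crude correlation-dimension estimate)."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs**2).sum(axis=-1))
    pair_d = dists[np.triu_indices(len(points), k=1)]
    counts = np.array([(pair_d < r).mean() for r in radii])
    return np.polyfit(np.log(radii), np.log(counts), 1)[0]
```

For a scalar series sampled from a limit cycle the estimate comes out close to 1, matching the value reported for the laminar phase. The all-pairs distance matrix limits this sketch to a few thousand points, and, as noted above, estimates from short records are not very accurate.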
Using an optimally convergent representation, a low dimensional model is constructed, which embodies in a
streamwise-invariant  form  the  effects  of  streamwise  structure.  Results  of  Stone  show  that  the  model  is  capable  of 
mimicking  the  stability  change  due  to  favorable  and  unfavorable  pressure  gradients.  Results  of  Aubry  et  al.  suggest  that 
polymer  drag  reduction  is  associated  with  stabilization  of  the  secondary  instabilities,  as  has  been  speculated.  Results  of 
Bloch  and  Marsden  indicate  that  drag  can  be  reduced  by  feedback,  and  that  this  is  mathematically  equivalent  to  polymer 
drag  reduction.  The  authors  showed  that  dynamical  systems  based  on  the  Proper  Orthogonal  Decomposition  have,  on  the 
average,  the  best  short  term  tracking  time  (the  time  that  a  model  tracks  the  true  system  accurately;  essential  for  control) 
for  a  given  number  of  modes.  In  recent  work,  the  authors  have  shown  that  several  assumptions  made  on  an  intuitive  basis 
in  the  work  of  Aubry  et  al.  may  be  justified  formally.  Berkooz  has  made  rigorous  estimates  using  the  proper  orthogonal 
decomposition  showing  that  a  structured  turbulent  flow,  such  as  the  wall  layer,  has  a  phase  space  representation  that 
remains  within  a  thin  slab  centered  on  the  most  energetic  modes  for  most  of  the  time.  Campbell  and  Holmes  have  shown 
several  results  in  connection  with  symmetry  breaking  in  systems  with  structurally  stable  heteroclinic  cycles.  This  work  is 
relevant  to  our  models  of  interacting  coherent  structures  in  boundary  layers  with  discrete  spanwise  symmetry,  such  as  that 
caused  by  riblets,  which  are  known  to  produce  drag  reduction. 


1.  Background 

Objective analysis of experimental measurements
indicates that there are recurrent streamwise
rolls present in the wall region, at least in
the quadratic mean sense [16,23]. Representation
theorems [24] permit optimal expansion of
the instantaneous velocity field in the wall region
in terms of these streamwise rolls [25]. Without
involving ourselves in the question of the source
of these rolls, we ask how they will behave
dynamically. Severely truncating our system, and
using Galerkin projection, we obtain a closed set
of non-linear ordinary differential equations with
ten degrees of freedom. The methods of dynamical
systems theory are applied to these equations.
Loss to unresolved modes is represented
by a Heisenberg parameter [6].

jointly funded by the US Airforce Office of Scientific Re-
and Space Administration, Langley Research Center, under
Contract No. NAG-1-954, and in part by the US National
Science Foundation under Grants Nos. DMS-SS- 14553 and
MSM 86-11164.

We find that for large values of the Heisenberg
parameter (large loss), we obtain stable streamwise
rolls having the experimentally observed
spacing. For smaller values of the parameter, we
have traveling waves (corresponding to cross-stream
drift of the rolls); we also find a heteroclinic
attracting orbit giving rise to intermittency;
and finally a chaotic state showing ghosts of all of
the above.

The  intermittent  jump  from  one  equilibrium 
point  to  the  other  in  the  heteroclinic  case  resem¬ 
bles  in  many  respects  the  burst  observed  in 
experiments.  Specifically,  the  time  between 
jumps,  and  the  duration  of  the  jumps,  is  approxi¬ 
mately  that  observed  in  a  burst;  the  jump  begins 
with  the  formation  of  a  narrowed  and  intensified 
updraft,  like  the  ejection  phase  of  a  burst,  and  is 
followed  by  a  gentle,  diffuse  downdraft,  like  the 
sweep  phase  of  a  burst.  During  the  jump  a  spike 
of  Reynolds  stress  is  produced,  as  is  observed  in 
a  burst,  although  the  magnitude  is  limited  in  our 
model  by  the  truncation  of  the  high  wavenumber 
components. 

The  behavior  is  quite  robust,  much  of  it  being 
due  to  the  symmetries  present  (Aubry’s  group 
has  examined  dimensions  up  to  64  real  and 
demonstrated  persistence  of  the  global  behavior 
[4]).  We  have  examined  eigenvalues  and  co¬ 
efficients  obtained  from  experiment  [18],  and 
from  exact  simulation  [29],  which  differ  in  mag¬ 
nitude.  Similar  behavior  is  obtained  in  both 
cases;  in  the  latter  case,  the  heteroclinic  orbits 
connect  limit  cycles  instead  of  fixed  points,  cor¬ 
responding  to  cross-stream  waving  of  the  stream- 
wise  rolls.  The  bifurcation  diagram  remains 
structurally  similar,  but  is  somewhat  distorted. 

The  role  of  the  pressure  term  derived  from  the 
permeable  boundary  between  the  wall  region 
and  the  outer  layer  is  made  clear  -  it  triggers  the 
intermittent  jumps,  which  otherwise  would  occur 
at  longer  and  longer  intervals,  as  the  system 
trajectory  is  attracted  closer  and  closer  to  the 
heteroclinic  cycle.  The  pressure  term  results  in 
the  jumps  occurring  at  essentially  random  times, 
and  the  magnitude  of  the  signal  determines  the 
average  timing.  This  clarifies  the  question  of 
whether  bursting  scales  with  wall  variables  or 
with  outer  variables  -  evidently  the  structure  of  a 
burst  scales  with  wall  variables,  while  the  time 
between  bursts  should  scale  in  a  complex  way 
with  both  inner  and  outer  variables  [30,31]. 


Change of the third order coefficients, corresponding
to acceleration or deceleration of the
mean flow, changes the heteroclinic cycles from
attracting to repelling, increasing or decreasing the
stability, in agreement with observations [30].

The existence of fixed points is an artifact
introduced  by  the  projection;  however,  a  decou¬ 
pled  model  still  displays  the  rich  dynamics 
[8, 12,  20]. 

2.  Recent  work 

In recent work, Berkooz [7, 8], in collaboration
with Holmes and Lumley, has shown that
several assumptions made on an intuitive basis in
the work of Aubry et al. may be justified formally,
namely: that the Heisenberg model used gives
the correct dissipation within a constant of order
unity, as assumed; that the Leonard stresses
(stresses coming from the interaction of resolved
and unresolved modes) may be neglected in
the case of modeling with no streamwise variation,
as assumed; that the previous result holds
for an arbitrary number of eigenfunctions when
no streamwise variation is present; that models
with no streamwise variation in effect average the
streamwise dynamics, as conjectured by Holmes.

In connection with our attempts to control the
wall region, Berkooz [8] introduced the notion of
short term tracking time $T_\varepsilon$. Consider the wall layer
starting from an initial condition. The model
starts from an initial condition which is the projection
of the initial condition for the full system.
If one fixes a distance in phase space, $\varepsilon$, one
expects that the dynamics of the model and those
of the projection of the full system will diverge.
The time it takes this divergence to get to size $\varepsilon$
(on the average) is $T_\varepsilon$, measured in wall
units. This is a measure of the time over which a
dynamical systems model tracks the true
dynamics accurately. $T_\varepsilon$ is of fundamental importance
in the control application, and it must be
of the order of the wall-region time scales to
make control possible. Berkooz [8] then showed
that dynamical systems based on the proper
orthogonal decomposition have, on the average,
the best $T_\varepsilon$ for a given number of modes.
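
The definition of the short term tracking time can be made concrete with a small sketch. It assumes the model trajectory and the projected full-system trajectory are sampled at the same instants, and that a Euclidean norm measures distance in phase space. The function name and argument layout are hypothetical and are not taken from Berkooz's work.

```python
import numpy as np

def tracking_time(model_traj, full_traj, times, eps):
    """First sampled time at which the model and the projected full system
    are at least eps apart; np.inf if that never happens.

    model_traj, full_traj: arrays (n_times, n_modes); times: array (n_times,).
    """
    dist = np.linalg.norm(model_traj - full_traj, axis=1)
    exceeded = np.nonzero(dist >= eps)[0]
    return times[exceeded[0]] if exceeded.size else np.inf
```

Averaging this quantity over an ensemble of initial conditions would give the "on the average" tracking time described above.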

Berkooz [7] has made rigorous estimates using
the proper orthogonal decomposition showing
that a structured turbulent flow, such as the wall
layer, has a phase space representation that remains
within a thin slab centered on the most
energetic modes for most of the time. This result
appears in a seminal form in Foias et al. [17] and
was shown independently by Aubry et al. [5].
However, exits from this region, which is all that
our low-dimensional models include, should not
be ignored, since they typically correspond to
violent events, such as the bursting phenomenon.
Berkooz and Holmes are trying to develop
a theory in which deterministic, low-dimensional
dynamics governing the low modes apply most of
the time, passages from and returns to this region being
modeled probabilistically. This might be viewed
as a dynamical closure. They plan to test their
theory on problems including the 32 and 54
dimensional projections of Aubry and Sanghi.

Campbell and Holmes [15] are continuing
their studies of breaking translation/reflection
symmetry into discrete translation and reflection,
specifically dihedral, symmetry ($O(2) \to D_4$) in systems
with structurally stable heteroclinic cycles. They
have proved that no analytic (second) integral of
motion exists in a certain limiting case and that
only two pairs of the continuum of O(2)-symmetric
heteroclinic cycles persist in general. They are
studying the bifurcations from these survivors.
This work is relevant to our models of interacting
coherent structures in boundary layers with
discrete spanwise symmetry, such as that caused
by riblets. This is to our knowledge the first
analytical contribution to our understanding of
the drag reduction caused by riblets.

3.  Control 

Polymer drag reduction is known to increase
the time between bursts, and to increase the
scale of the wall region, leaving it geometrically
the same [27]. In our model, stretching of the
wall region to increase its scale reduces the drag,
and requires stabilization of the bursts, to increase
the time between bursts [5]. It is thus
completely consistent with the observations. Our
sort of relatively simple model could be used as a
"black box" in active feedback systems to control
the boundary layer. Bloch and Marsden [14]
showed that feeding back eigenfunctions with the
proper phase (in the presence of noise) can delay
the bursting (the heteroclinic jump to the other
fixed point), thereby decreasing the average
drag. It is also possible to speed up the bursting,
thus increasing mixing, to control separation.
Bloch and Marsden [5] also showed that polymer
drag reduction is formally equivalent to delaying
the bursts by active feedback (increasing the
mean time between bursts). Hence, we may expect
an actively controlled layer to resemble a
passively polymer-drag-reduced layer.

The simplest system that contains the essential
features of our low-dimensional model is the
O(2)-symmetric 1:2 wavenumber interaction
studied by Armbruster et al. [1]; it shares the
essential properties of the models developed by
Aubry et al. [6]. It can be written in complex modal
coordinates (spanwise Fourier amplitudes) as

$$
\begin{aligned}
\dot z_1 &= \bar z_1 z_2 + z_1\left(\mu_1 + e_{11}|z_1|^2 + e_{12}|z_2|^2\right) + \varepsilon\,\xi_1(t), \\
\dot z_2 &= -z_1^2 + z_2\left(\mu_2 + e_{21}|z_1|^2 + e_{22}|z_2|^2\right) + \varepsilon\,\xi_2(t),
\end{aligned}
\tag{1}
$$

where the $\xi_j(t)$ are complex valued IID Wiener
processes representing the pressure perturbation
from the outer part of the boundary layer. With
suitable choices of the real parameters $\mu_j$, $e_{ij}$, for
$\varepsilon = 0$, this system has attracting heteroclinic cycles
which for $\varepsilon > 0$ lead to bursts at irregular
intervals with a skewed distribution of inter-event
times having an exponential tail, remarkably
similar to those seen experimentally
[22, 30].

It was felt that the energy expenditure required
for conventional stabilization of the saddle
points in heteroclinic cycles (and hence complete
suppression of bursting) might be excessive
and that a more modest strategy of delaying
bursting should be developed. Consequently, in
their preliminary study of control strategies for
ODEs such as (1), Bloch and Marsden [14] explored
this possibility, primarily by shifting the saddle
eigenvalues toward the left-hand half of the complex plane.
This was done in the context of linear control by
adding to (1) a diagonal linear feedback term with gain $\beta$,
which shifts the unstable eigenvalue $\lambda_u$ leftward
by $O(\beta)$. In physically relevant parameter regimes
the system (1) has only one unstable
eigenvalue. Since the mean interburst time is
$\langle T \rangle = \lambda_u^{-1} \ln(1/\varepsilon)$ and the distribution tail goes
as $\exp(-\lambda_u T)$ [31, 32], any such reduction in $\lambda_u$
delays bursting on average.
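
The effect of such an eigenvalue shift on the burst statistics can be made concrete with a short sketch based only on the estimate quoted above, mean interburst time equal to ln(1/ε)/λ_u. The function names and the numerical values are illustrative assumptions, and the shift is taken to be exactly β, whereas the text only states that it is of order β.

```python
import math

def mean_interburst_time(lambda_u, eps):
    """Mean time between bursts, <T> = ln(1/eps) / lambda_u, for noise
    level eps and unstable eigenvalue lambda_u > 0."""
    if lambda_u <= 0 or not 0 < eps < 1:
        raise ValueError("need lambda_u > 0 and 0 < eps < 1")
    return math.log(1.0 / eps) / lambda_u

def delayed_interburst_time(lambda_u, eps, beta):
    """Same estimate after the unstable eigenvalue has been shifted
    leftward by beta (assumed here to be an exact shift)."""
    return mean_interburst_time(lambda_u - beta, eps)

# Hypothetical values: halving the unstable eigenvalue doubles <T>.
baseline = mean_interburst_time(1.0, 1e-3)
controlled = delayed_interburst_time(1.0, 1e-3, beta=0.5)
```

Because the dependence on the noise level is only logarithmic, reducing λ_u is a far more effective way of delaying bursts than reducing ε, which is consistent with the modest strategy described above.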

The substantial problem of designing and implementing
a feedback law to effect this has not
yet been addressed. The general idea is to construct
a "smart" wall, with sensors and actuators
(piezoelectric or magnetoelastic) to effect local
geometry changes at the wall in response to
sensed velocity fields, hence modifying the coherent
structures. This, in turn, changes the
coefficients of the model. We do not yet know
specific details (direct numerical simulations are
being undertaken to investigate them), but it is
clear that, in general, all the coefficients $e_{ij}$ and
$\mu_j$, nonlinear as well as linear, will be affected,
and therefore we are really in the realm of
nonlinear control rather than the linear controller
assumed by Bloch and Marsden [14].

A second complication of the simple strategy
proposed by Bloch and Marsden [14] is that
models such as that above, and the multi-mode
elaborations of Aubry et al. [6], are local, since
they are produced by projection of the governing
equations on a small domain. The modeling
strategy relies on the fact that this domain is
large enough to contain a range of structures, yet
small enough that bursts do not occur in it at all
times, and hence one does not average away the
dynamics [8, 20]. In common with the strategy of
numerical simulations, one assumes that a two-dimensional
array of such periodic domains covering
a large surface will adequately reproduce
the statistics of an extended turbulent layer. In
reality, exactly periodic arrays of structures all
bursting in step are not seen and phase relationships
between distant spatial points may not be
simple. Thus, to adequately model a spatially
extended boundary layer, one should probably
take an array of systems such as the above,
weakly coupled and excited by weakly correlated
"pressures". As far as we know, the problem
of controlling such a distributed nonlinear
system, with stochastic excitation but strong underlying
symmetry and phase space structure of
the individual units, has not yet been addressed.

We are currently studying the construction and
properties of extended models using projection
onto spatially localized wavelet bases [9]. The
Kuramoto-Sivashinsky equation, more tractable
than the full boundary layer problem but sharing
features such as symmetries and heteroclinic cycles
[2], is being used in this work. Once these
systems are better understood, it will be possible
to pose the control problem more clearly.
This work is devoted to the experimental analysis of the flow created by a row of cylinders. As each cylinder wake
appears by the Hopf bifurcation of a global oscillating mode, this experiment is similar to a chain of coupled non-linear
oscillators and this dynamic system can be modelled using a Ginzburg-Landau type equation. A video analysis technique is
applied to visualization pictures in order to obtain the spatio-temporal evolution of the wakes. This diagram is then
analysed using the bi-orthogonal decomposition extended to its complex form. It is shown that only a few eigenmodes are
excited, giving a low-dimensional dynamic system that enables the different parameters of the model to be estimated.


1.  Introduction 

The goal of our experimental analysis is the
study of the flow consisting of a row of coupled
wakes [1]. The so-called Bénard-von Kármán
street, appearing in the wake of a cylinder by the
Hopf bifurcation of a global mode [2], constitutes
one of the simplest non-linear oscillators
of fluid mechanics. If we consider a row of
N parallel cylinders placed in a plane perpendicular
to a flow, the row of wakes is similar to a
chain of non-linear coupled oscillators, and can
be modelled using the system of N equations [3]:

$$
\frac{dA_i}{dt} = \bigl(j\omega + \varepsilon(1 + jc_0)\bigr)A_i(t)
 - (1 + jc_2)\,|A_i(t)|^2 A_i(t)
 + \nu(1 + jc_1)\bigl(A_{i+1}(t) + A_{i-1}(t) - 2A_i(t)\bigr),
$$

where $A_i$ represents the complex order parameter
associated with the spanwise motions in the
wake of the $i$th cylinder, $\varepsilon$ the control parameter
of the flow (i.e. the reduced Reynolds number),
$\omega + \varepsilon c_0$ the natural frequency at the corresponding
Reynolds number, and $c_2$ is a parameter
representing the variations of this frequency with
non-linearity. The coefficient $\nu(1 + jc_1)$ gives the
complex intensity of the coupling, where an interaction
with first neighbours only has been considered.
This equation can be seen as a discrete
form of the generalized Ginzburg-Landau equation
(GGL equation), which represents the
spatio-temporal evolution of a system of waves
propagating in unstable flows [4]. Recently,
some erratic spatio-temporal behaviour has been
observed in numerical simulation of the GGL
equation [5] and also in some experiments [3, 6].
We can, therefore, assume that the spatial aspect
of turbulence, absent in the temporal theories of
transition towards deterministic chaos, might be
recovered in the notion of spatio-temporal chaos
[7]. But the different aspects of the solutions of
the GGL equation (laminar or turbulent) depend
on the values of its coefficients. For instance,
if $1 + c_1c_2 < 0$, the system is unstable and
phase turbulence appears [7]. The challenge of
our experimental analysis is, therefore, to answer
the following questions: Is the GGL equation
a good model for our experiment of coupled
wakes? If so, are we able to calculate its coefficients?
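
To make the model concrete, the following sketch integrates a chain of N oscillators in which each complex amplitude has a linear growth term (jω + ε(1 + jc0))A, a cubic saturation term (1 + jc2)|A|²A and a first-neighbour coupling ν(1 + jc1)(A_{i+1} + A_{i-1} - 2A_i). It is only an illustration: the periodic treatment of the two ends of the row, the fourth-order Runge-Kutta scheme, the time step and all parameter values are assumptions, not the choices made for the simulation reported later in this paper.

```python
def ggl_rhs(a, omega, eps, c0, c1, c2, nu):
    """Right-hand side of the discrete GGL chain for amplitudes a[0..N-1]."""
    n = len(a)
    out = []
    for i in range(n):
        left, right = a[(i - 1) % n], a[(i + 1) % n]  # periodic ends (assumption)
        linear = (1j * omega + eps * (1 + 1j * c0)) * a[i]
        cubic = (1 + 1j * c2) * abs(a[i]) ** 2 * a[i]
        coupling = nu * (1 + 1j * c1) * (right + left - 2 * a[i])
        out.append(linear - cubic + coupling)
    return out

def rk4_step(a, dt, **p):
    """One classical fourth-order Runge-Kutta step."""
    k1 = ggl_rhs(a, **p)
    k2 = ggl_rhs([x + 0.5 * dt * k for x, k in zip(a, k1)], **p)
    k3 = ggl_rhs([x + 0.5 * dt * k for x, k in zip(a, k2)], **p)
    k4 = ggl_rhs([x + dt * k for x, k in zip(a, k3)], **p)
    return [x + dt * (q1 + 2 * q2 + 2 * q3 + q4) / 6
            for x, q1, q2, q3, q4 in zip(a, k1, k2, k3, k4)]

def simulate(a0, dt, steps, **p):
    """Advance the chain from the initial amplitudes a0 by `steps` steps."""
    a = list(a0)
    for _ in range(steps):
        a = rk4_step(a, dt, **p)
    return a

# Hypothetical parameters with 1 + c1*c2 > 0 (no phase turbulence expected).
params = dict(omega=1.0, eps=0.25, c0=0.0, c1=0.5, c2=0.5, nu=0.1)
final = simulate([0.1 + 0j] * 16, dt=0.01, steps=3000, **params)
```

Starting from identical amplitudes the coupling term vanishes and every oscillator saturates at |A| = √ε, which gives a simple check of the integrator; perturbing the initial condition and choosing 1 + c1c2 < 0 is the regime in which erratic behaviour would be looked for.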

In order to study the spatio-temporal evolution
of these coupled wakes, we apply the bi-orthogonal
decomposition [8] to experimental
signals resulting from a video image analysis. As
the GGL model is written in the complex plane,
we first build an analytical signal using the Hilbert
transform [9], and secondly apply the bi-orthogonal
decomposition in its complex form to
determine the eigenmodes of our system. The
projection of the GGL equation onto these
modes leads to a tensorial equation which
generalizes the notion of dispersion relation. The
parameters of this relation can be calculated by
solving a system of algebraic equations. Thus
we obtain the values of the coefficients of the
GGL equation corresponding to our experimental
system. A numerical simulation of the model
is then carried out in order to compare the original
flow with the numerical solution.


2.  Experimental  results 

A row of 16 cylinders is placed in the
200 mm x 200 mm working section of a hydrodynamic
water tunnel. Each cylinder is 2 mm in
diameter and its length is equal to the side of the
cross section of the channel. The distance between
adjacent cylinder axes is set to 8 mm, a value
for which strong coupling between the
vortex streets exists [10]. Through a small aperture
(diameter 0.5 mm) in the middle of each
cylinder, we inject some dye (an emulsion of
silicone oil) in order to visualize the flow. The
flow velocity is very low (around 0.04 m/s) and
this technique is suitable for visualizing the
waves that occur on the streak lines behind each
cylinder. Fig. 1 presents a snapshot of such a
visualization for a Reynolds number (based on
the cylinder diameter) of 80. This low value has
been chosen in order to avoid three-dimensional
effects that could take place along the cylinder
axes [11]. A successful verification of the
two-dimensionality of the flow allows us to consider
the spatio-temporal deformations of the dye filaments
as the order parameters describing each
wake. We observe in fig. 1 that wakes 4 and 11
have stopped oscillating. These events appear in
a completely erratic manner and can affect any
wake. We believe that these amplitude holes are
similar to some defects discovered recently in a
theoretical analysis [12] or in numerical simulations
[5] of the GGL equation. As we have no
three-dimensional effects, these holes are different
from some amplitude modulations that may
appear in the wake of one cylinder [11]. The
chaotic appearance of these holes seems to be
the basic ingredient of the spatio-temporal chaos
observed in these different situations and might
be a universal process of turbulence genesis.

Fig. 1. Snapshot of the flow around the 16 cylinders for a
Reynolds number of 80.

To achieve a better understanding of our
dynamic system, we build spatio-temporal diagrams
using the video recording of our visualization.
We make real-time acquisitions of one video
line placed at a distance of 12 mm behind the


M.  P.  Chauve,  P.  Le  Gal  /  Bi-orthogonal  decomposition  of  coupled  wakes 


Fig. 2. A video line showing the 16 peaks corresponding to
the 16 wakes emerging from the noisy background.


row of cylinders. Fig. 2 shows such a line at a
given time. It has been digitized on 512 pixels
and consists essentially of a row of peaks emerging
from a constant level. This low level is simply
the dark background of the picture and each of
the 16 peaks is the intersection of the video line
with the white dye streak. The position of the
maximum value of each peak is easily obtained
and constitutes the parameter $A_i$ associated with the
$i$th cylinder. Fig. 3 shows a spatio-temporal diagram
composed of 256 lines which have been
stored every 40 ms. Several holes are clearly
visible and illustrate the complexity of the flow.
We observe in this diagram that a difference of
about 15% can exist between the frequencies of
the coupled oscillators; thus the gradients of
phase seem to be at the origin of the
intermittent deaths of the wake oscillations [5, 1].


Fig. 3. Spatio-temporal diagram. Some amplitude holes are
clearly visible.
To study this diagram in a quantitative manner,
we chose to apply the complex bi-orthogonal
decomposition to this field. It should be noted
that a great advantage of this technique is that it
captures and synthesizes the essential spatio-temporal
features of our flow in a set of a few eigenmodes.


3.  Bi-orthogonal  decomposition 

3.1.  Analysis  tools 

Bi-orthogonal decomposition is an extension
in space and time of the proper orthogonal decomposition
(or the Karhunen-Loève decomposition)
proposed by Lumley [13] for the identification
of coherent structures in turbulence. Bi-orthogonal
decomposition has only recently been
introduced [8] and consists of a decomposition
into spatial orthogonal modes, called topos, and
temporal orthogonal modes, called chronos. It
has been shown [8] that a canonical decomposition
exists for any complex space-time signal
(provided it is a finite energy function of x
and t) which can be written as:

$$
u(x,t) = \sum_k \alpha_k\,\psi_k(t)\,\overline{\varphi_k(x)},
$$

where the overbar denotes the complex conjugate,
the $\alpha_k$ are the square roots of the eigenvalues
of a spatial (or temporal) two-point correlation
function and $\varphi_k(x)$ (or $\psi_k(t)$) are the corresponding
spatial (or temporal) eigenfunctions. These
two sets of functions are, in fact, composed of
orthonormal functions. It follows that an analysis
similar to that given by the proper orthogonal
decomposition can be performed. It is, however,
interesting to mention here that the notion of
coherent structures is extended to the time direction:
each space structure is associated with a
temporal structure, and a coherent structure is then
defined by a pair consisting of a chrono and a topo. The
energy of a structure is simply given by its eigenvalue
and the global energy of the signal is equal
to the sum of the eigenvalues:

$$
E(u) = \sum_k \alpha_k^2 .
$$

The effective number of degrees of freedom of
the signal, or its dimension, is given by the number
of non-zero eigenvalues.

Up to now, this decomposition has only been
used in its restrictive real form. As the GGL
model uses a complex order parameter $A_i(t)$, it
is necessary to extend our measurements to the
complex plane. This means that each real signal
is analytically extended by the calculation of an
imaginary part by the use of the Hilbert transform
[9]. This transformation acts like an amplifier
which has a gain of 1 and shifts the phase by
90°. By this means the bi-orthogonal decomposition
can be used in its complex form and takes
into account the phase dynamics of the signals
(just as the sine and cosine functions are both
necessary to represent phase dynamics in
Fourier space).
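
The analytic extension described here can be sketched in a few lines. The version below is a generic textbook construction, not the authors' code: it assumes a uniformly sampled real signal, uses a naive discrete Fourier transform (adequate for records of a few hundred samples), and builds the analytic signal by suppressing negative frequencies and doubling positive ones, which is equivalent to adding j times the Hilbert transform.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                for m in range(n))
        out.append(s / n if inverse else s)
    return out

def analytic_signal(x):
    """Complex signal whose real part is x and whose imaginary part is the
    Hilbert transform of x (unit gain, 90 degree phase shift)."""
    n = len(x)
    weights = [0.0] * n
    weights[0] = 1.0                      # keep the mean
    upper = n // 2 if n % 2 == 0 else (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0                  # double the positive frequencies
    if n % 2 == 0:
        weights[n // 2] = 1.0             # Nyquist bin is kept once
    spectrum = dft(x)
    return dft([w * s for w, s in zip(weights, spectrum)], inverse=True)

# Hypothetical test signal: five periods of a cosine over 64 samples.
n = 64
signal = [math.cos(2 * math.pi * 5 * m / n) for m in range(n)]
z = analytic_signal(signal)
```

For the cosine the imaginary part of `z` is the corresponding sine, so the modulus |z| is constant and the argument of `z` gives the phase directly, which is the property exploited in the complex decomposition.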

3.2.  Experimental  application 

The spatio-temporal diagram of fig. 3 is taken
to be the real part of a complex field. By
using the Hilbert transform we obtain the corresponding
imaginary part. This complex field is
represented by a matrix of 16 spatial points by
256 temporal ones. As this matrix is not a square
matrix, we chose to calculate a spatial correlation
matrix which is a tractable 16 x 16 Hermitian
matrix. In order to compute the eigenvalues
of this matrix, we first convert it into a 32 x 32
real symmetric one. Then a Householder reduction
to a tridiagonal form followed by a QL
iteration [14] is used to obtain the real
eigenvalues $\alpha_k^2$ (each appearing twice) and the
corresponding complex spatial eigenfunctions $\varphi_k$.
The chronos are obtained by a scalar product of the original
signals by the topos.
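
The steps of this computation can be sketched with NumPy, whose `numpy.linalg.eigh` handles Hermitian matrices directly, so the 32 x 32 real embedding is shown only to illustrate why each eigenvalue appears twice. This is an assumption-laden illustration rather than the original program: the function names are hypothetical, the test field is random, and the convention used is field = Σ α_k φ_k(i) ψ_k(t) with conjugation placed in the scalar product.

```python
import numpy as np

def real_symmetric_embedding(h):
    """32 x 32 real symmetric matrix [[A, -B], [B, A]] built from the
    16 x 16 Hermitian matrix h = A + jB; its spectrum is that of h, doubled."""
    a, b = h.real, h.imag
    return np.block([[a, -b], [b, a]])

def biorthogonal_decomposition(field):
    """field: complex array of shape (n_space, n_time).
    Returns alpha (n_space,), topos (n_space, n_space) and chronos
    (n_space, n_time), sorted by decreasing energy."""
    corr = field @ field.conj().T               # spatial correlation matrix
    eigvals, topos = np.linalg.eigh(corr)       # real eigenvalues, ascending
    order = np.argsort(eigvals)[::-1]           # most energetic mode first
    eigvals, topos = eigvals[order], topos[:, order]
    alpha = np.sqrt(np.clip(eigvals, 0.0, None))
    safe = np.where(alpha > 0, alpha, 1.0)      # avoid dividing by zero
    chronos = (topos.conj().T @ field) / safe[:, None]
    return alpha, topos, chronos

# Hypothetical 16 x 256 complex field standing in for the measured one.
rng = np.random.default_rng(0)
field = rng.standard_normal((16, 256)) + 1j * rng.standard_normal((16, 256))
alpha, topos, chronos = biorthogonal_decomposition(field)
reconstruction = (topos * alpha) @ chronos
```

The chronos obtained this way are orthonormal by construction, and summing over all modes reproduces the field exactly; truncating the sum to the leading modes gives the low-dimensional reconstruction.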

Fig. 4a shows the decrease of $p_k = \alpha_k^2/E$ versus
$k$, where the $p_k$ are the normalized energies of each
mode $k$: it can be seen that the energy is
concentrated in the first modes. The accumulated
energy is presented in fig. 4b, where we
observe that 90% of the global signal energy is
captured by the first three modes alone, indicating that
the dynamics of our system are contained in a
low-dimensional space.

Fig. 4. (a) Evolution of the normalized energy $p_k$ versus the
number k of the mode. (b) Cumulated energy showing that
only three modes contain 90% of the total energy.
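
The counting of modes behind such a statement is simple enough to write down. In the sketch below the weights are hypothetical and are not the measured values of fig. 4; the function returns the normalized energies, each eigenvalue divided by the total energy, and the smallest number of leading modes whose cumulated energy reaches a given fraction of the total.

```python
def normalized_energies(alphas):
    """Energies alpha_k^2 divided by their sum, sorted in decreasing order."""
    energies = sorted((a * a for a in alphas), reverse=True)
    total = sum(energies)
    return [e / total for e in energies]

def modes_needed(alphas, fraction=0.9):
    """Smallest number of leading modes capturing `fraction` of the energy."""
    running = 0.0
    for count, p in enumerate(normalized_energies(alphas), start=1):
        running += p
        if running >= fraction:
            return count
    return len(alphas)

# Hypothetical weights alpha_k for a four-mode signal.
example_alphas = [3.0, 2.0, 1.0, 0.5]
needed = modes_needed(example_alphas, fraction=0.9)
```

With these illustrative weights the first mode carries about 63% of the energy and the first two about 91%, so two modes suffice at the 90% level.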

Fig. 5 gives the evolution of the first six complex
chronos (fig. 5a: real part, fig. 5b: imaginary
part). It can be seen that the first chronos consist
of a nearly homogeneous rapid oscillation which
characterizes the basic von Kármán wake. The
number of periods present in these functions
allows an estimation of the basic frequency of
the wakes. The value obtained is 30% lower than
the classical value of the von Kármán street [11].
It has been verified that the wakes are two-dimensional
[1], which implies that this difference
cannot be a consequence of an oblique shedding
effect [11] and is probably due to the wake
coupling. However, this rapid oscillation is modulated
in amplitude with a very long period. For
the first two chronos, this period covers the
entire temporal window. We would point out
that there is a shift of $\pi$ between the modulations
of chronos 1 and 2. Chronos 3 and 4 present the
same features, but now the modulations appear
twice in the window and there is still a phase shift of
$\pi$ between both modulations. The fifth chrono
also consists of the rapid mode, which is
modulated in amplitude with the third harmonic.
Therefore, an entire hierarchy of modulations is
captured by this set of chronos. We think that
the dynamics of the wake deaths are included in
these phase variations of the modulations. In the
range 6-16, the chronos carry a small amount of
energy and are only linked to the noise.

Fig. 5. The real (a) and imaginary (b) parts of the first six
chronos.

Fig. 6 is a reconstruction of the real part of the
signal on only the first three modes:

$$
A_i(t) \approx \sum_{k=1}^{3} \alpha_k\,\psi_k(t)\,\varphi_k(i) .
$$

We can verify that this spatio-temporal diagram
agrees very well with the original measured
one in fig. 3. Up to now, we have not been able
to see any remarkable features in the topos.

Fig. 6. Real part of the signal calculated from the first three
modes.


4.  Comparison  with  the  GGL  model 

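Having calculated the decomposition of the
complex field $A_i(t)$ on the two sets of orthonormal
functions:

$$
A_i(t) = \sum_k \alpha_k\,\psi_k(t)\,\varphi_k(i),
$$

we project the function $|A_i(t)|^2 A_i(t)$, which appears
in the GGL equation, onto the same set of
chronos and topos:

$$
|A_i(t)|^2 A_i(t) = \sum_{j,k} \beta_{jk}\,\psi_j(t)\,\varphi_k(i).
$$

By replacing these two functions with their expressions
in the GGL equation, we obtain:

$$
\begin{aligned}
\sum_k \alpha_k \frac{d\psi_k}{dt}\,\varphi_k(i) ={}& \bigl[j\omega + \varepsilon(1 + jc_0)\bigr] \sum_k \alpha_k\,\psi_k(t)\,\varphi_k(i)
 - (1 + jc_2) \sum_{j,k} \beta_{jk}\,\psi_j(t)\,\varphi_k(i) \\
 &+ \nu(1 + jc_1) \sum_k \alpha_k\,\psi_k(t)\,\bigl[\varphi_k(i+1) + \varphi_k(i-1) - 2\varphi_k(i)\bigr].
\end{aligned}
$$

Using the properties of orthonormality of the
topos and chronos, we project this equation first
onto $\varphi_l$:

$$
\begin{aligned}
\alpha_l \frac{d\psi_l}{dt} ={}& \bigl[j\omega + \varepsilon(1 + jc_0)\bigr]\alpha_l\,\psi_l(t)
 - (1 + jc_2) \sum_j \beta_{jl}\,\psi_j(t) \\
 &+ \nu(1 + jc_1) \sum_k \alpha_k\,\psi_k(t)\,\bigl\langle \varphi_l(i) \,/\, \bigl(\varphi_k(i+1) + \varphi_k(i-1) - 2\varphi_k(i)\bigr) \bigr\rangle_i ,
\end{aligned}
$$

where $\langle\ /\ \rangle_i$ represents the scalar product of
topos. Then, in the same manner, we project
this last equation onto $\psi_s$:

$$
\begin{aligned}
\alpha_l \Bigl\langle \frac{d\psi_l(t)}{dt} \,/\, \psi_s \Bigr\rangle_t ={}& \bigl(j\omega + \varepsilon(1 + jc_0)\bigr)\alpha_l\,\delta_{sl}
 - (1 + jc_2)\,\beta_{sl} \\
 &+ \nu(1 + jc_1)\,\alpha_s\,\bigl\langle \varphi_l(i) \,/\, \bigl(\varphi_s(i+1) + \varphi_s(i-1) - 2\varphi_s(i)\bigr) \bigr\rangle_i ,
\end{aligned}
$$

where $\langle\ /\ \rangle_t$ is now the scalar product between
two chronos, and $\delta_{sl}$ the Kronecker symbol. We
denote by $\Omega_{sl}$ the term $\langle (d\psi_l(t)/dt) / \psi_s \rangle_t$ and by $K_{sl}$ the
term $\langle \varphi_l(i) / (\varphi_s(i+1) + \varphi_s(i-1) - 2\varphi_s(i)) \rangle_i$ in
order to write the GGL equation in the following
condensed form:

$$
\alpha_l\,\Omega_{sl} = \bigl(j\omega + \varepsilon(1 + jc_0)\bigr)\alpha_l\,\delta_{sl} - (1 + jc_2)\,\beta_{sl} + \nu(1 + jc_1)\,\alpha_s K_{sl}.
$$

The quantities $\Omega_{sl}$, $\beta_{sl}$ and $K_{sl}$ can be calculated
from the eigenfunctions obtained previously.
If we call $C_0$ the quantity $(j\omega + \varepsilon(1 + jc_0))$,
$C_1$ the quantity $\nu(1 + jc_1)$, $C_2$ the quantity $(1 + jc_2)$
and $\gamma_{sl}$ the ratio $\beta_{sl}/\alpha_l$, the last equation can
be written in the following way:

$$
\Omega_{sl} = C_0\,\delta_{sl} - C_2\,\gamma_{sl} + C_1\,(\alpha_s/\alpha_l)\,K_{sl} .
$$

This relation is an extension of the dispersion
relation of the Fourier space. The anti-Hermitian
tensor $\Omega_{sl}$ is a generalization of the notion of
frequency and the Hermitian tensor $K_{sl}$ a
generalization of the notion of wavenumber.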


Fig. 7. Real part of a simulated field using the calculated
coefficients from the experimental analysis (arbitrary units).

Therefore, the problem we are dealing with is
obtaining the equation of the plane representing
this dispersion relation in the space $(\gamma, \Omega, K)$.
From the previously performed analysis, three
different triplets $(\gamma_{ll}, \Omega_{ll}, K_{ll})$ can be calculated
from the three most energetic modes. The
solution of three linear equations leads to the
three parameters $C_0$, $C_1$ and $C_2$ and finally to an
estimation of the three numerical coefficients of
the GGL equation. With these coefficients, it is
possible to come back to the model and perform
a numerical simulation. An example of such a
result is presented in fig. 7, where some amplitude
holes can be observed, as in the original
signal (see fig. 3).
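
The linear solve mentioned here can be sketched as follows. For a diagonal entry (s = l) the Kronecker symbol and the ratio of weights are both equal to one, so the dispersion relation reduces to Ω = C0 - C2 γ + C1 K, and three triplets give three complex linear equations for C0, C1 and C2. The code is an illustration under that reading: the function names, the use of Cramer's rule and the synthetic triplets are assumptions, not the authors' procedure or data.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve_dispersion(triplets):
    """triplets: three (gamma, omega, k) tuples of complex numbers.
    Solves omega = C0 - C2*gamma + C1*k and returns (C0, C1, C2)."""
    rows = [[1.0, k, -gamma] for gamma, _, k in triplets]
    rhs = [omega for _, omega, _ in triplets]
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("triplets do not define a unique plane")
    solution = []
    for col in range(3):                    # Cramer's rule, column by column
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d)
    return tuple(solution)

def ggl_coefficients(c0_, c1_, c2_):
    """Read the real coefficients off C0, C1, C2 as defined in the text."""
    return {"eps": c0_.real,                # C0 = j*omega + eps*(1 + j*c0)
            "omega_plus_eps_c0": c0_.imag,
            "nu": c1_.real,                 # C1 = nu*(1 + j*c1)
            "c1": c1_.imag / c1_.real,
            "c2": c2_.imag}                 # C2 = 1 + j*c2

# Synthetic check: build triplets from known constants, then recover them.
true_c = (0.2 + 1.3j, 0.1 + 0.05j, 1.0 + 0.7j)
gammas = [0.5 + 0.1j, 0.3 - 0.2j, 0.8 + 0.4j]
ks = [-0.2 + 0j, -1.0 + 0.1j, -2.5 - 0.3j]
triplets = [(g, true_c[0] - true_c[2] * g + true_c[1] * k, k)
            for g, k in zip(gammas, ks)]
estimated = solve_dispersion(triplets)
```

Note that only the combination ω + εc0 is fixed by the imaginary part of C0, and that the real part of the estimated C2 should come out close to one if the normalization of the model is consistent with the data.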

5.  Conclusion 

As in numerical simulations of the generalized
Ginzburg-Landau equation, spatio-temporal
chaos has been observed in an experiment of
coupled wakes. This chaos is apparent in the
erratic formation of amplitude extinctions of the
Bénard-von Kármán streets. This dynamic system
has been analysed by the use of a video
imaging system and analytically extended to the
complex plane using the Hilbert transform. In
order to extract the most energetic space-time
structures and use them as a basic set of functions
onto which to project the GGL model, a complex
bi-orthogonal decomposition has been performed.
It has been shown that the dynamics of
the system studied are contained in a three-dimensional
eigenspace. The eigenmodes concerned
present long wave modulations which contain the
dynamics of amplitude holes. The projection of
the GGL equation onto the first three eigenmodes
leads to a generalized dispersion relation
whose solution produces an estimation of the
GGL coefficients. A numerical simulation with
these coefficients shows behaviour similar to
the experimental system. Thus we confirm that
the GGL equation is a good model for our
experimental coupled wakes. In particular, the
chaotic appearance of these holes (experimental
or numerical) seems to be a universal process of
turbulence genesis requiring further investigation.
Various changes in measured signals were observed in thermal convective experiments of low-temperature helium gas,
indicating that there is more than one turbulent state in the system. The more recently observed transition occurs at a
Rayleigh number of about $10^{11}$, within the hard turbulence regime. It was first observed [Wu et al., Phys. Rev. Lett. 64
(1990) 2140] in the analyses of the frequency power spectra, measured at the center of the experimental cell. In this paper,
it is shown that the power spectra can be equally well fitted by three different fitting functions, each having a different
physical interpretation. Thus, the transition is not well characterized by the power spectra. On the other hand, the study of
root-mean-squared temperature derivatives clearly indicates and quantifies this transition.


1.  Introduction 

Recently, there has been a lot of interest in
studying the Rayleigh-Bénard convection of
low-temperature helium gas [1-6]. The advantage
of using low-temperature helium gas is that
a wide range of Rayleigh numbers can be covered
by changing the gas density [7, 8]. Thus, the
system can be used to study the development of
different flow states. Specifically, it is ideal for
studying turbulence, which remains an essentially
unsolved problem in fluid mechanics.

The experimental system consists [1-3] of a
cylindrical cell, of diameter 20 cm and height (L)
40 cm, which is filled with helium gas at about
5 degrees Kelvin. The cell is heated from below
and the temperature difference between the top
and the bottom plates, $\Delta$, is of the order of
100 mK. Temperature as a function of time,
$T(t)$, at various points inside the cell is measured
by arsenic-doped silicon bolometers. These
bolometers are small in size, about 0.2 mm, with
a high sensitivity of 2 mK/$\Omega$, and short response
time of the order of 1 millisecond.

The system is described by the Boussinesq
equations [9] of an incompressible fluid:

$$
\frac{\partial u}{\partial t} - \nu\nabla^2 u + u\cdot\nabla u + \nabla p = g\alpha T e_z , \tag{1a}
$$

$$
\frac{\partial T}{\partial t} - \kappa\nabla^2 T + u\cdot\nabla T = 0 , \tag{1b}
$$

$$
\nabla\cdot u = 0 , \tag{1c}
$$

where $u$ is the velocity field, $p$ the pressure
divided by density and $T$ the temperature field,
while $e_z$ is the unit vector in the vertical direction.
Further, $\alpha$ is the volume thermal expansion
coefficient, $g$ is the acceleration due to gravity,
and $\kappa$ and $\nu$ are the thermal diffusivity and
kinematic viscosity of the gas, respectively. Eq.
(1) defines two dimensionless parameters: the
Rayleigh number (Ra), which is given by

$$
\mathrm{Ra} = \frac{g\alpha\Delta L^3}{\kappa\nu} \tag{2}
$$

and the Prandtl number, which is the ratio $\nu/\kappa$. In
this helium experiment, the Prandtl number is of
order unity and is almost constant; thus a flow
state is characterized by the Rayleigh number.
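
A short numerical sketch shows how the two dimensionless parameters follow from the physical quantities. The property values used below are hypothetical round numbers chosen only to exercise the formulas, not measured properties of the helium gas in this experiment; the only physical input is the ideal-gas estimate of the expansion coefficient, α = 1/T, which is itself an assumption.

```python
import math

def rayleigh_number(g, alpha, delta, height, kappa, nu):
    """Ra = g * alpha * Delta * L^3 / (kappa * nu), all in SI units."""
    return g * alpha * delta * height ** 3 / (kappa * nu)

def prandtl_number(nu, kappa):
    """Pr = nu / kappa."""
    return nu / kappa

# Cell height and temperature difference are of the order quoted in the text;
# kappa and nu are placeholders (hypothetical values).
g = 9.81            # m/s^2
alpha = 1.0 / 5.0   # 1/K, ideal-gas estimate at about 5 K (assumption)
delta = 0.1         # K, i.e. 100 mK
height = 0.4        # m
kappa = 1e-6        # m^2/s, hypothetical
nu = 1e-6           # m^2/s, hypothetical

ra = rayleigh_number(g, alpha, delta, height, kappa, nu)
pr = prandtl_number(nu, kappa)
```

Because Ra is inversely proportional to the product κν, and both diffusivities fall as the gas density is raised, changing the density sweeps the Rayleigh number over many decades while leaving the cell geometry and the Prandtl number essentially unchanged.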

As  the  Rayleigh  number  is  increased,  the  fluid 
progresses  from  a  laminar  state  to  a  turbulent 
state  which  exhibits  both  temporal  and  spatial 
disorder.  Two  different  turbulent  states  have 
been  observed  which  are  designated  as  the  soft 
and  hard  turbulence  [1-3].  The  soft-to-hard  tran¬ 
sition  occurs  at  Ra  about  10®.  The  main  differ¬ 
ence  between  the  soft  and  the  hard  turbulent 
states  is  that  hard  turbulence  is  a  self-similar 
state  in  which  all  measured  quantities  exhibit 
scaling behavior. These measured quantities include
the Nusselt number, Nu, which is the ratio
of the actual heat flux to that which would occur were
there only heat conduction, namely:

$$\mathrm{Nu} = \frac{H}{k \Delta / L} , \tag{3}$$


where  H  is  the  actual  heat  flux  and  k  is  the 
thermal  conductivity  of  the  gas.  Above  the  soft- 
hard  transition,  Nu  scales  with  Ra: 

$$\mathrm{Nu} \sim \mathrm{Ra}^{0.290 \pm 0.005} . \tag{4}$$

The scaling exponent 0.290 ± 0.005 agrees with
the theoretical result [3] of 2/7, which is also
confirmed in a two-dimensional numerical simulation
[10]. The other quantities are the root-mean-squared
fluctuations, $\Delta_c$, and the large
scale flow velocity, $U$, which also scale as powers of Ra.

Moreover, the soft-to-hard transition is also
characterized by other changes. First, the probability
distribution of temperature fluctuations,
recorded at the center of the cell, changes from
Gaussian-like to exponential-like [1-3]. Second,
there is a change in the shape of the frequency
power spectrum [4].

More recently, Wu et al. [4] observed a change
in the high-frequency parts of the power spectra
at Ra about $10^{11}$. For Ra $\le 7.3 \times 10^{10}$, when the
frequency and power are properly rescaled, the
whole power spectra (except for the initial flat
region with frequencies smaller than that corresponding
to the large scale flow) can be described
by a universal function. However, this
does not work for the high-frequency parts of the
data at higher Rayleigh numbers. Instead, the
data for Ra $\ge 7.3 \times 10^{10}$ can be made to "collapse"
into one single curve only by a more
complicated multifractal-like transformation.
Frisch and Vergassola [11] showed that just this
kind of scaling arises in the Parisi-Frisch multifractal
model of turbulence [12].

This second transition at Ra about $10^{11}$ is only
a change in the high-frequency parts of the
power spectra. The other measured characteristics
of hard turbulence remain unchanged. The
scaling behavior of the various quantities mentioned
above continues to hold for values of Ra
well beyond $10^{11}$. This paper gives a detailed
analysis of the power spectra, particularly at the
higher frequencies. In section 2, we describe
these analyses and the results obtained. We discuss
the results in section 3 and show that the
data can be equally well fitted by three different
fitting functions, each having a physical interpretation.
Thus, the transition is not well characterized
by the power spectra. We then include the
results of another analysis which clearly indicates
and quantifies this transition. Conclusions are
given in section 4.


2. Analyses of frequency power spectra

The frequency power spectra, $P(\omega)$, are obtained
by Fourier transforming $T(t)$ and taking
the absolute square,


Piu,)  = 


2  + 

/  =  0 


N/t 


(6) 


with $\omega$ being an integer times $1/(\tau N)$, where $\tau$ is
the experimental sampling period. In the actual
calculation, $T(t_0 + j\tau)$ is multiplied by a Hanning
window (see e.g. ref. [13]). We fit $P(\omega)$ for
sixteen Rayleigh numbers, ranging from $1.1 \times 10^{8}$
to $4.3 \times 10^{14}$, by three different functional
forms, each with three fitting parameters. In the
following, we shall first describe these different
fitting forms and then the results obtained.
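
The windowed spectrum described above can be sketched in a few lines. This is only an illustration under stated assumptions: the normalization of the original spectrum is not reproduced, the mean is subtracted before windowing (our choice, not stated in the text), and a direct O(N²) transform is used so that the example needs nothing beyond the standard library. The function names are hypothetical.

```python
import cmath
import math


def hann_window(n):
    """Symmetric Hann (Hanning) window of length n >= 2."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * j / (n - 1))) for j in range(n)]


def power_spectrum(samples, tau):
    """Return (frequencies, power) for a series sampled every tau seconds.

    Frequencies are integer multiples of 1/(tau*N); power is the absolute
    square of the discrete Fourier transform of the windowed samples.
    Normalization is left out on purpose (assumption: only shapes matter here).
    """
    n = len(samples)
    mean = sum(samples) / n
    window = hann_window(n)
    x = [(s - mean) * w for s, w in zip(samples, window)]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(xj * cmath.exp(-2j * math.pi * k * j / n) for j, xj in enumerate(x))
        freqs.append(k / (tau * n))
        power.append(abs(acc) ** 2)
    return freqs, power
```

For long records one would replace the explicit sum by a fast Fourier transform; the window and the frequency grid stay the same.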

According to ref. [4], the highest frequency
part of the spectra can be described by
$\exp[-(\omega/\omega_h)^{\beta}]$, where $\beta = 0.55 \pm 0.05$ and $\omega_h$ is some
frequency depending on Ra. Therefore, we first fit
the data by the following form,

$$P_1(\omega) = A_1 \omega^{-x_1} \exp[-(\omega/\omega_1)^{1/2}] , \tag{7}$$


where $A_1$, $x_1$ and $\omega_1$ are parameters fitted for
each Ra. We shall call this fit the exponential
one-half fit. The second functional form we use
is similar to the first one but with an exponential
cutoff,

$$P_2(\omega) = A_2 \omega^{-x_2} \exp(-\omega/\omega_2) , \tag{8}$$

with $A_2$, $x_2$ and $\omega_2$ being the fitting parameters.
We therefore call it the exponential fit. This
form conforms to standard ideas about power
spectra in turbulence [14, 15] and was used by
Procaccia et al. [6]. It can be shown (see appendix
and ref. [16]) that a multifractal-like fit [4]
with a structure like that of eq. (8) can be given
in the form

$$P(\omega) = P_0 \left(\frac{\mathrm{Ra}}{R_0}\right)^{c_1} \left(\frac{\omega}{\omega_0}\right)^{-x_2} \exp\left[-c_2 S \left(\frac{\omega}{\omega_0}\right)^{S_0/S}\right] , \quad \text{with } S = \ln(\mathrm{Ra}/R_0) , \tag{9}$$

where $P_0$, $R_0$, $\omega_0$, $S_0$, $c_1$, $c_2$ and $x_2$ are constants.


Thus, we choose the last fitting form to be a
power-law decay of exponent -1.3 with a
stretched-exponential cutoff,

$$P_3(\omega) = A_3 \omega^{-1.3} \exp[-(\omega/\omega_3)^{\beta}] , \tag{10}$$

with $A_3$, $\omega_3$ and $\beta$ the fitting parameters, and call
it the stretched-exponential fit.

We  perform  a  nonlinear  least-square  fit  to 
obtain  the  parameters.  As  a  preliminary  check, 
we  plot  the  result  of  the  fit  together  with  the  data 
in  the  same  graph.  We  find  that  the  three  fits 
work  almost  equally  well.  In  order  to  see  better 
the  quality  of  these  different  fits,  we  plot  the 
ratio  of  the  data  to  the  result  of  fit,  r,  for  each 
Ra.  Typical  plots  of  r  for  the  three  fits  are  shown 
for  four  Ra  in  fig.  1 .  We  do  not  expect  the  fits  to 
work  either  for  the  initial  flat  region  where  fre¬ 
quencies  are  smaller  than  that  corresponding  to 
the  large  scale  flow,  or  near  the  highest  fre¬ 
quency  end  where  thermal  noise  levels  off  the 
data.  This  is  reflected  in  fig.  1  in  which  r  is  much 
smaller  than  1  for  the  initial  flat  region  and  much 
larger  than  1  for  the  highest  frequencies.  We 
observe  that  there  is  an  intermediate  region  in 
which  r  is  around  1,  showing  that  the  different 
forms  fit  the  data  approximately. 
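
The quality check by the ratio r can be sketched as follows. The sketch is not the nonlinear least-squares procedure of the paper: it assumes a cutoff exponent β held fixed (β = 1/2 for the exponential one-half fit, β = 1 for the exponential fit), in which case ln P = ln A - x ln ω - (ω/ω_c)^β is linear in (ln A, x, ω_c^(-β)) and can be solved by ordinary least squares in log space. All function names are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def model(w, a, x, w_c, beta):
    """A * w**(-x) * exp(-(w / w_c)**beta)."""
    return a * w ** (-x) * math.exp(-((w / w_c) ** beta))


def fit_cutoff_form(omega, power, beta):
    """Least-squares fit in log space for fixed beta; returns (A, x, w_c)."""
    rows = [[1.0, -math.log(w), -(w ** beta)] for w in omega]
    y = [math.log(p) for p in power]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    ln_a, x, c = solve_linear(ata, aty)
    return math.exp(ln_a), x, c ** (-1.0 / beta)


def ratio(omega, power, params, beta):
    """r = data / fit at each frequency; r close to 1 marks the fitted range."""
    return [p / model(w, *params, beta) for w, p in zip(omega, power)]
```

Fitting in log space weights all decades of the spectrum equally, which is a design choice of this sketch; the stretched-exponential fit, where β is itself a parameter, would need an outer scan or a nonlinear solver.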

The  exponential  one-half  fit  works  better  for 
the  first  few  lowest  Ra  studied  but  does  not  work 
as  well  for  the  intermediate  range  of  Ra.  For  the 
highest  Ra  studied  (~10”-10''*),  it  fits  the  data 
better  again  and  is  close  to  the  stretched- 
exponential  fit  which  is  similar  to  the  exponential 
fit  for  lower  Ra.  This  can  be  seen  from  the  value 
of the parameter $\beta$, which is approximately 1 for
low Ra and close to 0.5 for higher Ra (see below).
The  intermediate  range  of  Ra  (lO'^sRas  10‘‘) 
is  best  fitted  by  the  exponential  fitting  form.  As 
mentioned above, the power spectra "collapsed"
into one universal curve for Ra $\le 7.3 \times 10^{10}$ [4].
Thus,  we  expect  that  the  data  of  lower  Ra  should 
also  be  better  fitted  by  the  exponential  fit.  The 
result  that  the  exponential  one-half  fit  actually 
works  better  for  the  data  of  the  first  few  lowest 
Ra  is  understood  as  follows.  For  the  data  of  the 
Fig. 1. Quality-checking of the three different fitting forms, the exponential one-half, the exponential and the stretched-
exponential  fits  (see  eqs.  (7),  (8)  and  (10),  respectively),  for  four  different  Rayleigh  numbers,  (a)  Ra  =  6.0  x  10",  (b) 
Ra  =  7.3  X  10*”,  (c)  Ra  =  6.7  x  10'",  and  (d)  Ra  =  4.3  x  10'".  We  plot  the  ratio  of  the  data  of  the  power  spectrum  to  the  result  of 
fit,  r,  versus  the  frequency,  w.  The  dotted  line  is  for  the  exponential  one-half  fit,  the  dot-dashed  line  is  for  the  exponential  fit  and 
the  solid  line  is  for  the  stretched-exponential  fit.  It  can  be  seen  that  none  of  the  three  fitting  forms  works  too  well  for  the  highest 
Ra. 


first  few  lowest  Ra  studied  here,  the  range  of  the 
power  law  is  very  short  so  it  is  very  difficult  to  fit 
the  data  with  a  power-law  decay  and  then  an 
exponential  cutoff.  Rather,  the  whole  region  is 
better approximated by just an exponential one-half
decay (the value of $x_1$ is about 0 for the
lowest Ra, see below). None of the three fitting
forms fits the data of Ra $= 4.3 \times 10^{14}$ (the highest
Ra studied) well.


The exponential fit is good for data of intermediate
values of Ra but not as good for higher
Ra. This point has already been noted in ref. [6].
In particular, if one plots

$$G(\omega) = P(\omega) \exp(\omega/\omega_2) \tag{11}$$

versus $\omega$ in a log-log plot, one does not get a
reasonably good straight line. Instead, the curve
appeared [6] to develop into two straight lines,
suggesting $G(\omega)$ might be a combination of two
power-law decays, or at least a more complicated
form than a single power-law decay.

After checking the quality of the various fits,
we then investigate how the fitting parameters
vary with Ra. The dependence of the three
parameters, $x_1$, $x_2$ and $\beta$, on Ra is shown in figs.
2, 3 and 4, respectively. We see that both $x_2$ and
$\beta$ display a change in behavior at Ra about $10^{11}$:
$x_2$ scatters around 1.3 and then increases to 1.8,
while $\beta$ scatters around 0.9 and then decreases to
about 0.4. In contrast, $x_1$ behaves quite differently:
except for some slight scattering, it increases
monotonically from about 0 to 1.4. Comparing
eq. (10) with eq. (9), one sees that if the
multifractal-like fit is good, then $\beta$ corresponds to
$S_0/S$ above the transition. Thus, the reciprocal of
$\beta$ should increase linearly with ln Ra. In fig. 4,
we plot $1/\beta$ versus ln Ra and find that this indeed
is true except for the highest Ra. This fact,
together with the above observation that the
stretched-exponential form does not fit the
highest Ra data too well, indicates that the
multifractal-like transformation may work only for an
intermediate range of Ra, as was noted already in
ref. [4]. The slope obtained is 0.15 ± 0.01, which


agrees with $1/S_0 = 0.152$ (see appendix for the
value of $S_0$).

Fig. 2. The behavior of the fitted parameter $x_1$ (see eq. (7))
as a function of Ra.

Fig. 3. The behavior of the fitted parameter $x_2$ (see eq. (8))
as a function of Ra. It can be seen that there is a change in
behavior at Ra about $10^{11}$.

Fig. 4. The dependence of the fitted parameter $\beta$ (see eq.
(10)) on Ra. A change in behavior is seen at Ra about $10^{11}$.
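
The linear dependence of 1/β on ln Ra can be made concrete with a short sketch. It assumes the identification β = S_0/S with S = ln(Ra/R_0), and takes R_0 = 10^8 and a reference Rayleigh number of 7.3 × 10^10, values that reproduce S_0 = 6.59 and 1/S_0 = 0.152 quoted in the text. The names below are hypothetical.

```python
import math

R0 = 1.0e8        # reference Rayleigh number (assumed; consistent with S_0 = 6.59)
RA_REF = 7.3e10   # Rayleigh number at which S takes the value S_0


def s_value(ra, r0=R0):
    """S = ln(Ra / R_0)."""
    return math.log(ra / r0)


S0 = s_value(RA_REF)


def predicted_inverse_beta(ra):
    """1/beta = S / S_0, i.e. a straight line in ln(Ra) with slope 1/S_0."""
    return s_value(ra) / S0


def predicted_slope():
    """Slope of 1/beta versus ln(Ra) implied by the multifractal-like form."""
    return 1.0 / S0
```

The predicted slope, about 0.152, is the number against which the measured slope 0.15 ± 0.01 is compared.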

Fig. 5. The dependence of the three characteristic frequencies,
$\omega_1$, $\omega_2$ and $\omega_3$ (see eqs. (7), (8) and (10)), on Ra.

Next, we compare the three frequencies $\omega_1$, $\omega_2$
and $\omega_3$, which are the typical scales describing
dissipation. In fig. 5 we present a log-log plot of
these frequencies versus Ra. Below the transition,
$\omega_2$ and $\omega_3$ are indistinguishable within measurement
errors. This fact, together with the fact
that $\beta$ is close to one in this region of Ra, just
echoes the result [6] that eq. (8) with $x_2 = 1.3 \pm 0.1$
is a good description of the power spectra for
lower values of Ra. However, for higher values
of Ra, the two frequencies deviate from each
other significantly and behave differently as a
function of Ra. The frequency $\omega_1$ is smaller than
$\omega_2$ by more than an order of magnitude but it
seems to be proportional to $\omega_2$. In fact, the three


frequencies  all  scale 

as  a  power  of  R 

/  2 

(OjL 

forRaslO"  , 

K 

1  J(^s0.62±0.05 

for  Ra  a  10"  , 

(o^L^ 

forRaslO"  , 

K 

1  RgO-S’^OOS 

for  Ra  a  10"  , 

and 

T  ^ 

W3L 

[  Ra'’  *’""  ' 

forRa:sl0"  , 

K 

[  Ra0.07i0.06 

for  Ra  a  10"  . 

(12) 

(13) 


(14) 


Thus, all three frequencies have essentially
the same scaling behavior for Ra $\le 10^{11}$. This
implies that, below the transition, a well-defined
characteristic frequency scale describing dissipation
exists, which is independent of whatever
functional form is assumed for dissipation. Above
the transition, this is no longer true, as the behavior
of the frequency obtained depends crucially
on the functional form assumed, which is a
priori unknown. This point will be elaborated in
the following section.
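
Scaling exponents on either side of a transition, such as those of the characteristic frequencies, can be estimated by a straight-line fit in log-log coordinates. The sketch below is an illustration only: it assumes the transition Rayleigh number is given, fits each side independently, and does not estimate error bars. The function names are hypothetical.

```python
import math


def power_law_exponent(ra_values, y_values):
    """Least-squares slope of ln(y) versus ln(Ra)."""
    lx = [math.log(r) for r in ra_values]
    ly = [math.log(y) for y in y_values]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    return sxy / sxx


def exponents_across_transition(ra_values, y_values, ra_transition):
    """Return (exponent below, exponent above) the given transition Rayleigh number."""
    below = [(r, y) for r, y in zip(ra_values, y_values) if r <= ra_transition]
    above = [(r, y) for r, y in zip(ra_values, y_values) if r >= ra_transition]
    return (power_law_exponent(*zip(*below)), power_law_exponent(*zip(*above)))
```

Each side needs at least two distinct Rayleigh numbers for the slope to be defined.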


3.  Discussion 

We  see  that  the  data  of  the  power  spectra  do 
not  distinguish  sharply  between  the  different  fit¬ 
ting  functions.  In  the  following,  we  shall  argue 
that  this  ambiguity  allows  for  different  interpre¬ 
tations  of  the  transition. 

Using standard ideas in power spectra, we
associate the power-law decay with a cascade of
quantities, like kinetic energy flux [14] or $T^2$ flux
[15, 17], from larger scales down to smaller
scales, and the exponential or stretched-exponential
cutoff with the dissipation of these quantities
when viscosity or thermal diffusivity becomes
important. The stretched-exponential fitting
form then implies that, above the transition, the
cascade remains unchanged but the form of dissipation
changes. With $\beta$ decreasing as Ra increases,
it suggests that the dissipation decreases
as compared to that extrapolated from below the
transition. On the other hand, the exponential fit,
together with the picture of $G(\omega)$ (eq. (11)) changing
form, implies that above the transition the cascade
changes, possibly from a cascade of one
quantity to that of another, giving rise to a second
power law, while the form of dissipation remains
unchanged.

This  possibility  of  more  than  one  interpreta¬ 
tion  is  related  to  another  interesting  question: 
what  is  the  functional  form  describing  dissipa¬ 
tion?  It  is  known  [18]  that  the  wavenumber 
power  spectra  (spatial  Fourier  transform)  must 
decay  at  least  as  fast  as  an  exponential,  or  else 
the  temperature  field  becomes  unbounded  con¬ 
trary  to  experimental  evidence.  However,  not 
very  much  can  be  said  about  the  relationship 
between the wavenumber and the frequency except
for the usually assumed frozen-flow hypothesis
[19]. The results of the above analysis show
that the present data do not distinguish sharply
between the three different functional forms of
dissipation.

Hence, from analyzing the power spectra, we
can conclude that a transition occurs at Ra about
$10^{11}$ but it is difficult to characterize it. It is then
natural to search for another, better characterization.
The study [6] of the root-mean-squared
temperature derivative via the dimensionless
quantity $Q$,

$$Q^2 = \frac{\overline{(dT/dt)^2}}{\Delta_c^2 \kappa^2 / L^4} , \tag{15}$$

with the overbar denoting a time average and $\Delta_c$
the root-mean-squared temperature fluctuation,

$$\Delta_c^2 = \overline{(T - \overline{T})^2} , \tag{16}$$

shows that $Q$ has a sharp change of behavior at
Ra about $10^{11}$:

$$Q \sim \begin{cases} \mathrm{Ra}^{0.67 \pm 0.04} & \text{for } \mathrm{Ra} \le 10^{11} , \\ \mathrm{Ra}^{0.48 \pm 0.06} & \text{for } \mathrm{Ra} \ge 10^{11} . \end{cases} \tag{17}$$

This change in the scaling exponent clearly quantifies
the transition.
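
A minimal sketch of how Q could be estimated from a sampled temperature record is given below. It assumes the definition Q² = mean[(dT/dt)²] / (Δ_c² κ²/L⁴), with Δ_c² the variance of T, and approximates the derivative by central differences, which is our choice and not necessarily the procedure of ref. [6]. The function name is hypothetical.

```python
import math


def dimensionless_rms_derivative(samples, tau, kappa, height):
    """Estimate Q from samples taken every tau seconds.

    Q = sqrt(mean((dT/dt)**2)) / (Delta_c * kappa / L**2), where Delta_c is
    the root-mean-squared temperature fluctuation about the mean.
    """
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((t - mean) ** 2 for t in samples) / n
    # Central differences on interior points (assumption: the record is well resolved).
    derivs = [(samples[j + 1] - samples[j - 1]) / (2.0 * tau) for j in range(1, n - 1)]
    mean_sq_deriv = sum(d * d for d in derivs) / len(derivs)
    return math.sqrt(mean_sq_deriv / variance) * height ** 2 / kappa
```

For a pure sinusoid of frequency f the estimate reduces to 2πf L²/κ, which makes explicit that Q is a frequency measured in units of κ/L².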

The study of $Q$ provides other clues to the
nature of the transition. $Q$ is essentially the rate
of temperature fluctuations, in units of $\Delta_c \kappa/L^2$.
Since $\Delta_c$ is the measured root-mean-squared temperature
fluctuation, $Q$ just defines a frequency
scale, in units of $\kappa/L^2$, over which fluctuations
occur. We expect $Q$ to be dominated by fast
fluctuations; thus this frequency scale is the inverse
of the shortest timescale appearing in the
problem. The shortest timescale, $\tau_1$, previously
identified in the problem, was found [20] to scale
as a power of Ra:


T,  _  I  for  Ra  s  lO"  , 

forRaslO". 


(18) 


We compare $1/Q$ and $\tau_1$ in fig. 6 and see that
they indeed have the same scaling behavior. The
increase of the scaling exponent in eq. (18)
implies that above the transition, the typical
timescale over which fluctuations occur increases
as compared to that extrapolated from
below the transition. Using the frozen-flow type
argument [19], a longer timescale means a longer
lengthscale. As we expect most of the temperature
changes to occur across the cap of thermal
plumes (which are one of the coherent structures
observed [21] in the problem), a longer
lengthscale then suggests that the surface of the
plumes is more convoluted. This may thus be
related to the suggestion that this transition arises
from the roughening of thermal plumes [6].
However, we have to admit that all these arguments
are highly speculative.

Fig. 6. Comparison of $1/Q$ (see eq. (15) for definition of $Q$)
with $\tau_1 \kappa/L^2$, where $\tau_1$ is the shortest timescale identified in
the problem. It can be clearly seen that the two quantities
behave similarly.


4.  Conclusion 

We perform a detailed analysis of the frequency
power spectra of temperature fluctuations
and show that the data are equally well fitted
by three different fitting forms. Consequently,
different physical interpretations of the nature of
the transition are allowed. Therefore, it is difficult
to characterize the transition using power
spectra. On the other hand, the dimensionless
root-mean-squared temperature derivative, $Q$,
scales as a power of Ra, with the exponent
decreasing from 0.67 ± 0.04 to 0.48 ± 0.06 at Ra
about $10^{11}$. This decrease in dissipation clearly
quantifies the transition.

Appendix. Derivation of eq. (9)


It was shown in ref. [4] that all the power
spectra, for Ra $\ge 7.3 \times 10^{10}$, collapse into one
single curve, $f(a)$, under the following multifractal-like
transformation,

$$f = \frac{\ln[P(\omega)/P_0]}{\ln(\mathrm{Ra}/R_0)} , \qquad a = \frac{\ln(\omega/\omega_0)}{\ln(\mathrm{Ra}/R_0)} , \tag{19}$$

where $P_0$, $\omega_0$ and $R_0$ are some constants. This
implies that for Ra $\ge 7.3 \times 10^{10}$, $P(\omega)$ is described
by

$$P(\omega) = P_0 \exp[S f(a)] \quad \text{with } S = \ln(\mathrm{Ra}/R_0) . \tag{20}$$

The functional form of $f(a)$ can be obtained by
assuming that the multifractal-like fit has the structure
of eq. (8). This is justified by the observation
that the data of Ra $= 7.3 \times 10^{10}$ are also well
approximated by eq. (8) (see fig. 1c). Rewriting
eq. (8) in terms of $a$, we have

$$P_2(\omega) = P_0 \exp[S_0 (c_1 - x_2 a - c_2 e^{S_0 a})] , \tag{21}$$

where $S_0$ is the value of $S$ at Ra $= 7.3 \times 10^{10}$,
$c_1 = \ln(A_2 \omega_0^{-x_2}/P_0)/S_0$ and $c_2 = \omega_0/(S_0 \omega_2)$.
Comparing eq. (20) with eq. (21), we find

$$f(a) = c_1 - x_2 a - c_2 e^{S_0 a} , \tag{22}$$

agreeing with the form given by Castaing [16].
Using  Po  =  (5.8±0.7)x  10"",  a>„L-/K  =  (l.l± 
0.2)  X  10^  and  /?„  =  1  x  10**  from  ref.  [4],  and 
X2-I.33,  wL^/k  =  1.35  X  10^  and 
K  =  1.26  X  lO**  from  the  fit  of  eq.  (8)  for  Ra  = 
7.3  X  10'*’;  we  have 


$$S_0 = 6.59 , \quad c_1 = 1.29 , \quad c_2 = 0.013 . \tag{23}$$


Except  for  c,,  these  values  agree  with  Castaing’s 
results  [16].  Finally,  using  eqs.  (20)  and  (22),  we 
see  that  the  multifractal-like  fit  with  a  structure 
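of eq. (8) can be given in the following form:

$$P(\omega) = P_0 \left(\frac{\mathrm{Ra}}{R_0}\right)^{c_1} \left(\frac{\omega}{\omega_0}\right)^{-x_2} \exp\left[-c_2 S \left(\frac{\omega}{\omega_0}\right)^{S_0/S}\right] . \tag{24}$$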


This  is  just  eq.  (9). 
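
The consistency of this construction can be checked numerically. The sketch below assumes the multifractal-like form P(ω) = P_0 (Ra/R_0)^c1 (ω/ω_0)^(-x2) exp[-c2 S (ω/ω_0)^(S_0/S)] with S = ln(Ra/R_0), and the constants c1 = ln(A_2 ω_0^(-x2)/P_0)/S_0 and c2 = ω_0/(S_0 ω_2) obtained by rewriting the exponential form in terms of a. At the reference Rayleigh number, where S = S_0, it should coincide with A_2 ω^(-x2) exp(-ω/ω_2). All function names, and any numerical values passed to them, are hypothetical.

```python
import math


def exponential_form(omega, a2, x2, omega2):
    """A_2 * omega**(-x_2) * exp(-omega / omega_2)."""
    return a2 * omega ** (-x2) * math.exp(-omega / omega2)


def constants_from_exponential_fit(a2, x2, omega2, p0, omega0, s0):
    """Return (c_1, c_2) such that the two forms agree when S = S_0."""
    c1 = math.log(a2 * omega0 ** (-x2) / p0) / s0
    c2 = omega0 / (s0 * omega2)
    return c1, c2


def multifractal_form(omega, ra, p0, r0, omega0, s0, c1, c2, x2):
    """Multifractal-like spectrum with S = ln(Ra / R_0)."""
    s = math.log(ra / r0)
    prefactor = p0 * (ra / r0) ** c1 * (omega / omega0) ** (-x2)
    return prefactor * math.exp(-c2 * s * (omega / omega0) ** (s0 / s))
```

Away from the reference Rayleigh number the cutoff exponent S_0/S differs from one, which is how the form generates a stretched-exponential cutoff whose exponent decreases as Ra grows.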

Local heating of a liquid just below its surface may generate surface waves and hydrodynamical oscillations. The
resulting  system  exhibits  a  great  wealth  of  behaviour  when  a  laser  beam  is  used  for  heating,  leading  to  the  creation  of  an 
unsteady  thermal  lens.  Chaotic  time  signals  have  been  recorded  and  analyzed  using  combined  approaches;  power  spectra, 
3D  projections,  Poincare  sections,  periodic  orbits,  generalized  Renyi  dimensions.  These  approaches  are  applied  to  the 
characterization  of  one  typical  experimental  chaotic  attractor.  The  usefulness  of  each  technique  is  outlined  and,  finally,  a 
likely  value  for  the  information  dimension  of  the  attractor  is  found  to  be  close  to  4. 


1.  Introduction 

In the rapidly growing field of nonlinear
dynamics,  there  still  exists  a  need  for  typical, 
well  controlled  and  cheap  experiments.  To  our 
knowledge,  the  most  celebrated  and  most 
studied  experiments  seem  to  be  Rayleigh- 
Benard  instability  [1-4],  Belousov-Zhabotinsky 
reaction  [5, 6],  nonlinear  electrical  circuits  [7, 8], 
lasers [9] or Couette-Taylor systems [10].

The  above  list  is  by  no  means  a  complete 
review  of  the  existing  experiments  but  it  only 
recalls  well-known  examples  of  experimental 
nonlinear  dynamical  systems. 

The  oscillatory  thermal  lens  is  another  good 
candidate  for  being  a  generic  experimental  non¬ 
linear  dynamical  system. 

To  our  knowledge,  this  phenomenon  was  first 
mentioned in 1973 in a very brief report [11]. It
was  independently  discovered  by  our  group  in 
1981  [12]  and  since  that  time  we  have  been 
continuously  studying  it.  Progressively  our  work 
has  set  the  oscillations  of  the  thermal  lens  in  the 
very  middle  of  nonlinear  dynamics  with  con¬ 


nections  to  bifurcation  and  instability  theory, 
Feigenbaum  cascade,  quasiperiodicity  and  chaos. 
When  the  oscillations  are  due  to  the  thermal  lens 
produced  by  absorption  of  the  laser  beam,  a 
beautiful  ring  pattern  is  the  consequence  of  in¬ 
terferences  inside  the  beam  passing  through  the 
thermal  lens.  Oscillations  of  the  thermal  lens  are 
easily  detected  by  oscillations  of  the  ring  pattern. 
They  look  like  heartbeats  at  frequencies  close  to 
1  Hz  and,  accordingly,  the  phenomenon  has 
been  named  optical  heartbeat  (HB). 

Recently,  Bazhenov  et  al.  [13],  Viznyuk  and 
Sukhodol’skii  [14]  published  papers  on  ex¬ 
perimental  study  of  the  oscillations  of  a  heated 
surface  and  their  work  witnesses  an  increasing 
interest  about  these  free  surface  instabilities. 

The  present  paper  is  focused  on  description 
and  analysis  of  experimental  results  and  data. 
More  precisely  we  will  concentrate  on  new  re¬ 
sults  about  chaotic  experimental  signals  and  out¬ 
line  the  usefulness  of  multiple  approaches  for 
analyzing  experimental  time  series.  Only  short 
outlines  will  be  devoted  to  experimental  results 
for non-chaotic regimes and we will skip discussion
of theoretical models. Details about non-chaotic
regimes of the HB-system can be found
in refs. [15-20] and about the theoretical models
in refs. [21-23].

2.  Experiments 

2.1.  Experimental  set-up  and  data  acquisition 
(HB1 system)

Quantitative  and  detailed  information  about 
experiments  has  been  reported  elsewhere  [12, 
15-20,24-31].  Let  us  only  recall  that  all  experi¬ 
ments  basically  rely  on  a  continuous  heating  of  a 
liquid  close  to  the  free  surface.  Many  geomet¬ 
ries,  configurations  and  liquids  have  been  tested. 
In  the  present  experiment  called  heartbeat  1 
(HB1-experiment), a horizontal laser beam is
focused  below  the  free  surface  of  an  absorbing 
liquid  (fig.  1).  Hence,  the  focus  point  is  heated 
and  becomes  surrounded  by  a  refractive  index 
gradient  acting  as  a  diverging  lens.  After  crossing 
the  thermal  lens,  the  laser  beam  diverges  and  a 
high  contrast  ring  pattern  is  produced  by  what 
may  be  understood  as  thermal  lens  aberrations. 
This  pattern  can  be  projected  onto  a  screen  and, 
as  the  pattern  is  large  (0.1  m  up  to  1  m),  it  is  the 
best  way  for  showing  and  analyzing  the  very 
small  thermal  lens  (=10"^  m).  Thermal  lensing  is 
a  well-known  phenomenon  which  was  reported 
in  the  sixties,  for  instance,  by  Gordon  et  al.  [32]. 
Conversely,  oscillations  of  a  thermal  lens  have 
been  studied  more  recently,  in  the  eighties  [12]. 
This  phenomenon  occurs  when  the  distance  be¬ 
tween  the  heated  point  and  the  surface  is  small 
enough  for  feed  back  from  surface  waves  to  the 


Fig.  1.  Heartbeat  1  experiments  (principle). 


heated  zone.  When  the  thermal  lens  departs 
from  steadiness,  the  ring  pattern  evolves  to 
HB’s. To give a picture, the ring patterns behave
like  quiet  heartbeats  when  the  hydrodynamical 
system  oscillates  periodically  and  they  can  be 
compared  to  a  heart  attack  when  chaos  is 
reached  (this  comparison  holds  between  the  as¬ 
pects  of  the  optical  pattern  and  of  the  human 
heart.  Conversely,  the  underlying  electrical  ac¬ 
tivity  of  the  human  heart  may  follow  chaotic 
dynamics  when  healthy  and  may  depart  from 
chaos  in  case  of  a  heart  attack  [33].) 

Fig. 2 is a schematic of the experimental set-up;
the source is an argon ion laser (item 1).
Laser power P is adjusted by rotation of the
plane of polarization relative to a polarizer (Glan
prism and polarizer, item 3). Then, the laser
light comes through a window inside a thermoregulated
box (item 11) which contains all other
pieces of the set-up except data storage and
processing.

Fig. 2. Experimental set-up for HB1: (1) Ar$^+$ laser; (3)
power adjustment; (6) focusing lens; (7) pellicle beam splitter;
(8) sealed cell containing liquid and colorant; (9) photodiode;
(10) translucent screen; (11) thermoregulated room;
(12) neutral attenuator; (13) photodiode; (16) data acquisition
and storage.

Inside  the  thermoregulated  box  we  find  a  pelli¬ 
cle  beam  splitter  (item  7)  allowing  permanent 
control  and  measurement  of  the  incident  laser 
power  (photodiode  +  attenuator,  items  12-13). 
Item 6 is the lens focusing the beam on the
sealed cell containing a solution of toluene with
50 mg/l Red Organol BX 1750 dye (item 8). A
micropositioner  allows  variation  of  the  distance  d 
between  laser  beam  and  surface.  Finally  the 
HB’s  fringes  are  smoothed  by  a  translucent 
screen  (item  10)  and  measured  by  a  photodiode 
(item  9).  The  experimental  signals  from  both 
photodiodes  are  fed  to  an  acquisition  interface 
and,  after  processing,  they  are  stored  in  a  PC 
(item  16). 

A  real  time  acquisition  and  sampling  of  ex¬ 
perimental  data  is  performed  by  the  PC  inter¬ 
face.  The  signal  from  photodiode  9  is  sampled  at 
a  chosen  fixed  frequency,  then  it  is  possibly 
amplified  and  extracted  from  continuous  back¬ 
ground.  Finally  it  is  stored  as  a  two-byte  integer. 

Sampling  frequency,  amplification  ratio  and 
offset can be adjusted in the ranges 0-1000 Hz,
1-128 and 0-10 V, respectively. The
bandpass  cut-off  of  the  time  signal  is  about 
100  Hz. 

2.2.  Physical  understanding  of  experiments 

HB  experiments  essentially  rely  on  buoyancy, 
thermal  diffusion  and  Marangoni  effect. 

Buoyancy and thermal diffusion are the agents
which transfer heat from the heated region up to
the free surface. With d being the distance between
surface and heated zone, buoyancy is likely to be
predominant for high d values and thermal diffusion
for small d values.

The  Marangoni  effect,  due  to  a  negative  tem¬ 
perature  derivative  of  surface  tension  ejects  hot 
liquid  away  when  it  reaches  the  surface. 

When  the  parameters  are  correctly  adjusted, 
oscillations  can  grow  inside  the  liquid  together 
with  surface  waves. 


Various  stationary  behaviours  can  be  obtained 
by  varying  the  thermophysical  properties  of  the 
liquid  (nature  of  the  liquid,  boundary  tempera¬ 
ture),  the  distance  d  from  the  surface  and  the 
rate  of  heating.  However,  the  pertinent  thermo¬ 
physical  parameters  have  not  yet  been  identified 
and,  up  to  now,  we  cannot  a  priori  predict  which 
gas-liquid  interfaces  will  exhibit  unsteady  be¬ 
haviours. 

Typical  physical  quantities  are  the  frequency 
of  the  phenomenon  (=1-10  Hz)  and  the  maximal 
distance,  between  the  free  surface  and  the 
heated  zone  (=5  x  10^^  m).  Inside  the  fluid  the 
typical  scales  are  very  small  (down  to  5  x 
10“*  m),  accordingly  the  temperature  field  re¬ 
mains  unknown. 

In  all  experiments  performed  by  our  group, 
the  variable  parameters  are  distance  d  and  rate 
of  heating,  while  the  nature  of  the  liquid  and  the 
boundary  temperature  are  kept  fixed. 

2.3.  Field  of  dynamical  state 

Fig. 3 shows a HB1 state diagram in parameter
plane  {P,  d),  P  being  the  laser  power  and  d  the 
distance  between  focus  point  and  free  surface. 
The  external  (boundary)  temperature  is  moni¬ 
tored  and  held  fixed  at  a  constant  value  (25  ± 
0.05°C). The liquid is toluene colored by Red
Organol BX1750 (0.05 kg/m$^3$). Dynamical states
and  bifurcations  have  been  identified  by  the 
power  spectra  of  the  experimental  time  signals, 
and  checked  by  other  approaches  (see  section  4) . 
Curve  CTl  is  the  frontier  between  steadiness 
(below,  labelled  S)  and  periodicity  (above,  la¬ 
belled  P).  The  transition  on  CTl  is  a  super¬ 
critical  Hopf  bifurcation. 

In  the  experiment,  the  operator  scans  over  d 
at  fixed  power  P.  Scanning  for  both  increasing 
and  decreasing  d  values  shows  no  evidence  for 
hysteresis  within  experimental  accuracy. 

Although each power value was scanned very
precisely (d-steps up to 20 μm), at high d's, the
location of curve CTl suffers high uncertainties
(see error bars on fig. 3) because of the very low
amplitude of the oscillations and because of their
intermittency.

Fig. 3. Power-distance state plane of HB1 system (toluene with 0.05 kg/m$^3$ Red Organol BX1750; fixed boundary temperature:
25 ± 0.05°C). Continuous and dashed lines are the frontiers between steadiness (S), periodicity (P), period doubling (DP) and
chaos (CH). Points with labels refer to experimental results reported in the paper.

Only the firmly established periodic (label P), doubly periodic (label DP) and chaotic (label CH) zones of diagram (P, d) have been shown on fig. 3. Smaller windows have been observed (chaos, periodicity, quasi-periodicity, chaos with periodic component) but their size is too small to ensure reproducibility and precise location on diagram (P, d).

With  silicon  oils  whose  thermophysical  prop¬ 
erties  are  better  adapted  to  the  hydrodynamical 
experiment,  the  state  diagram  can  be  drawn  with 
a  higher  precision  and  the  following  behaviours 
have  been  observed  [18, 19, 25]: 

Periodic with one fundamental frequency, periodic with subharmonics corresponding to a Feigenbaum cascade, and quasiperiodic with two fundamental frequencies. Frequency lockings have also been observed when the two fundamental frequencies of the quasiperiodic signals become commensurable, leading to the experimental observation of devil's staircases. Hysteresis has also been detected.


However,  temporal  chaos  has  not  been  ob¬ 
served  with  silicon  oils  though  it  has  been  ob¬ 
served  with  toluene.  Hence,  section  4  which  is 
devoted  to  chaotic  HBl  behaviour  relies  on  ex¬ 
periments  with  toluene. 

3.  Combined  approaches  and  characterization 
of  chaotic  regimes 

Chaotic  regimes  can  be  detected  by  their  high 
level  continuous  background  in  power  spectra. 
However,  such  a  continuous  background  is  not  a 
unique  signature  of  chaos  and  a  precise  charac¬ 
terization  requires  going  beyond  power  spectra 
toward  quantitative  characteristics  of  chaos. 

The  present  section  briefly  recalls  the  main 
features  of  the  (classical)  approaches  that  we 
used  for  the  characterization  of  chaotic  time 
series,  namely,  qualitative  2D  projections  of  3D 
embeddings,  Poincare  sections,  detection  of  re¬ 
current  points  and  unstable  orbits  and  quantita¬ 
tive  characterization  by  Renyi  dimensions. 

Applications of the above approaches for analyzing experimental time series will be reported in section 4.

3.1.  Qualitative  approaches  of  the  chaotic 
regimes 

3.1.1.  n-dimensional  reconstruction  and  2D 
projection  of  3D  embeddings 

The analysis of the chaotic experimental signal begins with the reconstruction of a trajectory in an n-dimensional phase space. By using the time delay method first proposed by Packard [34] and established by Takens [35, 36], we reconstruct a projection of the (assumed) underlying attractor embedded in $R^n$. For the sake of brevity, this geometric structure will be called “attractor” throughout the rest of the paper. Roughly speaking, the time delay method operates as follows: the physical signal $x(t)$ is sampled into a time series of $N$ experimental points $x(t_0 + i\,\Delta t)$ ($1 \le i \le N$; $\Delta t$ sampling time; $t_0$ beginning of measurements). $N$-point trajectories (more exactly, trajectories of $\mathrm{INT}([N - (n-1)p]/m)$ points) can be reconstructed in $R^n$, vector $X_j$ having coordinates $X_j^k = x(t_0 + jm\,\Delta t + kp\,\Delta t)$ ($0 \le j < \mathrm{INT}([N - (n-1)p]/m)$; $0 \le k \le n-1$; $m$ and $p$ “well chosen” strictly positive integers). $p\,\Delta t$ is the time delay. A result of the time delay method is that the $X_j$'s are randomly distributed with respect to the natural measure over the reconstructed attractor.
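The reconstruction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the experiment: the function name `delay_embed` and the use of NumPy are assumptions, and the origin $t_0$ and sampling time $\Delta t$ are implicit in the sampled array.

```python
import numpy as np

def delay_embed(x, n, p, m=1):
    """Return the delay vectors X_j as an array of shape (n_vectors, n).

    X_j[k] = x[j*m + k*p] for 0 <= k <= n-1, i.e. the coordinates
    x(t0 + j*m*dt + k*p*dt) of the text. n is the embedding dimension,
    p the delay (in samples) and m the stride between successive vectors.
    """
    x = np.asarray(x, dtype=float)
    N = x.size
    span = (n - 1) * p                    # samples covered by one delay vector
    if N <= span:
        raise ValueError("time series too short for this (n, p)")
    n_vectors = (N - span) // m           # INT([N - (n-1)p] / m)
    starts = m * np.arange(n_vectors)     # first sample index j*m of each vector
    offsets = p * np.arange(n)            # k*p for k = 0 .. n-1
    return x[starts[:, None] + offsets[None, :]]

# Usage: a 3D embedding of a short ramp with delay p = 2 samples.
X = delay_embed(np.arange(20), n=3, p=2)
```

With `n=3`, two of the three columns of `X` give the kind of 2D projection of a 3D embedding discussed in the text.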

Firm foundations of $R^n$ reconstructions are provided by the Mane theorem [37] (although contested by Sauer et al. [38]) about the parametrization of compact sets by real coordinates and by its extension to Takens’ theorem [35] proving that, for $n \ge 2D + 1$ ($D$ being the dimension of the attractor), topology preserving reconstructions are dense among reconstructions.

$R^3$ reconstructions can easily be 2D-projected on computer screens and are a big help in characterizing the attractor. Obviously, we know that $R^3$ is not a good choice for reconstructing and embedding the experimental attractor since it does not comply with Takens’ criterion. Furthermore, dimension computations exhibit a dimension close to 4 for the experimental attractor (see section 4). 3D embedding will then reduce the attractor. Nevertheless, 2D projections of 3D embeddings can be used as an interactive tool showing real time evolution of the dynamical states and simplifying the analysis of power spectra which can be associated with the aspect of the 3D embedding (e.g. fig. 4).

3.1.2.  Poincare  sections,  recurrent  points  and 
periodic  orbits 

Poincare sections containing the crossing points between the reconstructed phase space trajectory and a given surface [39] are useful for revealing periodic and quasi-periodic oscillations. Well defined clusters of points on the Poincare section are the signature of periodicity or quasi-periodicity. Conversely, cloudy sections do not allow clear-cut conclusions because these sections may result either from high dimension quasi-periodicity or from chaos.
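As an illustration of how such a section can be extracted from a sampled trajectory, the following Python sketch collects the points where the trajectory crosses a coordinate plane while going downward. The function name `poincare_section`, the restriction to planes of the form "coordinate = constant" and the linear interpolation between samples are assumptions made for this sketch.

```python
import numpy as np

def poincare_section(X, axis, level):
    """Downward crossings of the plane X[:, axis] == level.

    X is a trajectory of shape (T, n). Each crossing point is located by
    linear interpolation between the two samples on either side of the plane.
    """
    X = np.asarray(X, dtype=float)
    c = X[:, axis] - level
    idx = np.where((c[:-1] > 0) & (c[1:] <= 0))[0]   # above, then on/below
    frac = c[idx] / (c[idx] - c[idx + 1])            # position of the crossing
    return X[idx] + frac[:, None] * (X[idx + 1] - X[idx])

# Usage: a helix crosses the plane "second coordinate = 0" once per turn.
t = np.arange(0, 20, 0.01)
helix = np.column_stack([np.cos(t), np.sin(t), 0.1 * t])
section = poincare_section(helix, axis=1, level=0.0)
```

For a periodic signal the returned points cluster tightly; for the experimental data they would form the diffuse clouds described in section 4.2.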

Another approach that we followed only in a qualitative way is the extraction of recurrent points and periodic orbits from the attractor. ε-recurrent points of order m are defined by the fact that the phase trajectory comes back into an ε-neighbourhood of these points after lag time $m\,\Delta t$ [40, 41]. The peaks in the histograms of these ε-recurrent points as a function of order m reveal possible periodic orbits in the attractor. Theory of dynamical systems establishes that many chaotic strange attractors have a skeleton of unstable periodic orbits which are dense on the attractor [42, 43]. The real phase trajectory will be captured by unstable orbits along the stable manifold and sooner or later will leave the orbits because of the divergence along the unstable manifold. Thus, the detection of portions of periodic orbits reveals the underlying skeleton of the attractor and may help in understanding its organization as well as its evolution under change of parameters.
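The histogram of ε-recurrent points can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `recurrence_histogram` is hypothetical, the Euclidean norm is assumed for the neighbourhood, and a point $X_j$ is counted as ε-recurrent of order m when $\|X_{j+m} - X_j\| < \epsilon$.

```python
import numpy as np

def recurrence_histogram(X, eps, max_order):
    """counts[m] = number of points X_j with |X_{j+m} - X_j| < eps.

    X has shape (T, n); orders m = 1 .. max_order are scanned
    (counts[0] is left at zero).
    """
    X = np.asarray(X, dtype=float)
    counts = np.zeros(max_order + 1, dtype=int)
    for m in range(1, max_order + 1):
        d = np.linalg.norm(X[m:] - X[:-m], axis=1)   # distance after lag m
        counts[m] = np.count_nonzero(d < eps)
    return counts

# Usage: a signal of period 50 samples gives a single peak at m = 50.
i = np.arange(500)
cycle = np.column_stack([np.sin(2 * np.pi * i / 50), np.cos(2 * np.pi * i / 50)])
counts = recurrence_histogram(cycle, eps=0.05, max_order=60)
```

Once a peak order m has been found, the trajectory segments `X[j:j + m + 1]` for the recurrent indices j are the candidate portions of periodic orbits.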

3.2.  Quantitative  characterization  of  the 
attractor 

A given attractor can be quantitatively characterized by numbers like Renyi dimensions, generalized metric entropies and Lyapunov exponents. We mainly computed dimensions because they are a first clue for the estimation of the real phase space dimension.

Renyi dimensions $D_q$ ($q$ real) are global characterizations of the distribution of a given quantity (mass, charge, density, probability of presence, ...) over a support set defined by its geometric measure $\mu$ [44-46]. For dynamical systems and attractors, the involved quantity is the distribution of trajectory points in phase space and the geometric object is the set of representative phase space points for time tending to infinity. The dimension $D_q$ is the (space) scaling exponent of the $\mu$-weighted $(q-1)$th moment of the probability $p(i, r)$ that a point of the attractor lies inside ball $i$ with radius $r$.

Furthermore, the measure $\mu$ itself is the probability of presence which is the “natural measure” over the attractor. Thus, given a partition $B$ of the attractor by boxes of equal size $r$, the following scaling law holds in the limit $r \to 0$ and defines the set of Renyi dimensions:

$$\lim_{r \to 0} \sum_{B} \left\{ \mu(i, r)\,[p(i, r)]^{q-1} \right\} \sim r^{(q-1) D_q}, \quad q \neq 1. \qquad (1)$$

The natural probability measure being ergodic, the partition B can be replaced by averaging over m boxes randomly distributed with respect to the measure and the scaling law can then be written either with respect to the radius (fixed radius) or with respect to the probability (fixed mass) [47].

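The relevant formulae are

$$\lim_{r \to 0} \frac{1}{m} \sum_{j=1}^{m} [p(j, r)]^{q-1} \sim r^{(q-1) D_q}, \quad q \neq 1, \qquad (2)$$

for the fixed radius approach, and

$$\lim_{p \to 0} \frac{1}{m} \sum_{j=1}^{m} [\delta(j, p)]^{\gamma} \sim p^{\gamma / D(\gamma)} \qquad (3)$$

for the fixed mass approach, $\delta(j, p)$ being the size of box $j$ containing probability $p$ and $D(\gamma)$ being connected to Renyi dimensions by

$$D_q = D(\gamma = (1-q) D_q). \qquad (4)$$

In the limit $q \to 1$ ($\gamma \to 0$) eqs. (2), (3) reduce to the definition of the information dimension $D_1$:

$$\lim_{r \to 0} \frac{1}{m} \sum_{j=1}^{m} \log p(j, r) = D_1 \log r, \qquad (5)$$

$$\lim_{p \to 0} \frac{1}{m} \sum_{j=1}^{m} \log[\delta(j, p)] = \frac{1}{D_1} \log p. \qquad (6)$$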

In practical computations, probability $p(i, r)$ is equal to the ratio $N_i/N$, the number of points lying inside box $i$ over the total number of points on the attractor. Similarly, in the fixed mass approach, $\delta(j, p)$ is the distance of the $k$th nearest neighbour if $p$ is taken equal to $k/N$. However, for experimental time series, the resolution of the data is finite and $N$ also remains finite. Accordingly, the limits $r \to 0$ and $p \to 0$ cannot be reached and the actual scaling properties which can be computed do not necessarily coincide with the true ones: (i) in the fixed radius method, the scaling law will hold only over finite, and sometimes unidentifiable, $r$-intervals, especially for the case $q \le 0$ when the poorly filled boxes significantly contribute to the l.h.s. of eqs. (2) and (5); (ii) in the fixed mass approach, the asymptotic behaviour for $N \to \infty$ sometimes will not be reached and the resulting overestimation of $\delta$ may significantly bias $D(\gamma)$ for $\gamma \ge 0$ (eqs. (3) and (6)).
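To make the fixed radius estimate concrete, here is a minimal Python sketch of the average logarithm of the probability as a function of the radius, followed by a straight-line fit whose slope estimates the information dimension. It is an illustration only: the function name, the brute-force distance computation and the random choice of reference points are assumptions, and the scaling interval is supplied by hand.

```python
import numpy as np

def information_dimension_fixed_radius(X, radii, n_ref=200, seed=0):
    """Estimate D_1 as the slope of <log p(j, r)> against log r.

    X: points on the attractor, shape (N, n). radii: radii assumed to lie
    inside the scaling zone. n_ref reference points are drawn at random
    from X, which samples them according to the natural measure.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    ref = rng.choice(len(X), size=min(n_ref, len(X)), replace=False)
    # Distances from every reference point to every point: shape (n_ref, N).
    d = np.linalg.norm(X[ref, None, :] - X[None, :, :], axis=2)
    radii = np.asarray(radii, dtype=float)
    # p(j, r) = N_j / N; the reference point counts itself, so p > 0.
    mean_log_p = np.array(
        [np.mean(np.log(np.mean(d <= r, axis=1))) for r in radii]
    )
    slope, _ = np.polyfit(np.log(radii), mean_log_p, 1)
    return slope, mean_log_p

# Usage on a set of known dimension 1 (points on a circle).
theta = 2 * np.pi * np.arange(2000) / 2000
circle = np.column_stack([np.cos(theta), np.sin(theta)])
d1, curve = information_dimension_fixed_radius(circle, np.geomspace(0.05, 0.3, 8))
```

The brute-force distance matrix is adequate for a few thousand points; for sets of the size handled in this work a spatial search structure would be needed.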

Such limitations and biases, which are due mainly to finite sets of points, lacunarity, drift and noise [48-51], may also arise from the finite precision of data (below 12-bit resolution [52]) or from a badly chosen sampling frequency [53]. In the present work, we handle sets of more than $10^5$ data points with 15 bit precision. The sampling frequency has been chosen small enough to avoid underestimation of dimensions ($p\,\Delta t$ = one eighth of pseudo-period) and the embedding dimension and time step have been empirically adjusted by trial and error until the largest scaling intervals were obtained (see section 4.2).

Systematic  optimization  of  these  empirical 
procedures  is  desirable  [54, 55]  but  likely  would 
only  bring  minor  improvements  of  our  results. 

A last remark about Renyi dimensions is that the set of Renyi dimensions complies with the inequality [45]

$$D_q \ge D_{q'} \quad \text{for } q \le q'. \qquad (7)$$

The higher $q$, the more significant will be the relative contribution of dense regions of the attractor to summation in eq. (2). Conversely, sparsely occupied regions will significantly influence the value of $D_q$ for strongly negative $q$. Hence scanning on $q$ from $-\infty$ to $+\infty$ is equivalent to scanning on probability density from less densely filled regions to most densely filled ones. However, each $D_q$ value is an average of the scaling exponent over the attractor and it does not give any insight on the location of local scaling exponents on the attractor (local dimensions) [56]. Badii and Broggi [51] have pointed out that the fixed radius method is better suited for high $q$-values whereas the fixed mass approach operates better for small $q$’s.

4. Experimental chaotic attractors

As  stated  in  the  introduction  of  the  present 
paper,  chaotic  attractors  have  been  detected  in 
the  experimental  time  series  of  the  HBl  experi¬ 
ment.  The  present  section  is  devoted  to  these 
experimental  results.  We  shall  show  how  the 
HBl  system  evolves  when  varying  the  parame¬ 
ters  and  then  characterize  one  of  the  experimen¬ 
tal  attractors. 

4.1.  Evolution  of  the  system  under  changes  of 
parameters 

Figs. 4a-4g gather a sequence of 7 different dynamical states of the thermal lens. All states but the first one have been recorded with the same laser power P = 121 mW and they are ordered from high to low focus-surface distances (except for the first one, the experiments are located on the same horizontal line at the top of fig. 3; they are ordered from right to left).

Fig. 4. Time signal, power spectrum (ordinate is the square root of the amplitude with arbitrary units) and 3D-projection of a sequence of dynamical states for decreasing focus-surface distance (labels a-g in fig. 3). (a) periodic state (laser power: 66 mW; distance from bottom of meniscus: 0.49 mm). (b)-(g) period doubling, interwoven orbits, periodic orbit, and random jumps between periodic orbits (laser power: 121 mW; distance from bottom of meniscus in mm: 0.39; 0.32; 0.22; 0.18; 0.16; 0.11).

For each state of the system, we give a part of the time signal, the associated power spectrum and a 3D projection. We choose high sampling frequencies (1000 or 2000 Hz) which give high resolution on 3D embeddings. For dimension computations, the sampling frequency must be lower.

The 2D projections of 3D embeddings bring much more information than the time signals and power spectra alone. For high distances d, we observe a periodic state with fundamental frequency at about 2 Hz and several harmonics. The 2D projection shows a cycle with two different regions: a concentrated zone and a large sparsely populated orbit. The corresponding phenomena in the time signal are a slow variation of light intensity followed by a sharp peak (fig. 4a). When we decrease the distance d, period doubling is observed and the 2D projection precisely shows that the system alternately follows each part of the twin orbit (fig. 4b) [57]. Decreasing d further, we find that the power spectrum evolves to a well defined periodic peak surrounded by a continuous background and the 2D projection reveals that the system randomly chooses between interwoven periodic orbits (fig. 4c). The other states of the sequence successively show a well defined periodic orbit (fig. 4d), and a new growth of the orbit thickness (2D projection) and of the noisy background (power spectrum) suggesting the existence of a chaotic attractor (figs. 4e and 4f). When the focus point is close to the surface, a periodic state is reached again but with a small and noisy signal which gives a thick periodic orbit in the 2D projection (fig. 4g).

4.2.  Characterizations  of  a  chaotic  attractor 

Several  chaotic  states  of  the  HBl  experiment 
have  been  recorded  and  analyzed.  From  these, 
we  have  chosen  one  example  to  report  some 
results  of  the  approaches  described  in  section  3. 
This  example  is  a  dynamical  state  labelled  C17  in 
fig.  3.  The  incident  laser  power  is  66  mW,  the 
focus  is  0.08  mm  above  the  bottom  level  of  the 
meniscus  and  the  sampling  frequency  is  200  Hz. 
163  840  data  points  have  been  stored. 


The 2D projection of the 3D embedding of the attractor and a Poincare section are given by figs. 5a, 5b, respectively. Fig. 5a shows a cloud of points without evident structure apart from a short thick curve in the left lower part and a very dense area in the middle. The curve is the track of a well defined motif in the time evolution of the thermal lens and the dense area indicates that the oscillations almost stop and the data accordingly cluster around the same value before dispersing again. The Poincare section of fig. 5b is the intersection of the 3D projection with the plane y(3) = 4500 for trajectories going downward. Again, no clear structure can be detected in the Poincare section which shows two diffuse clouds. Other sections in different regions of the attractor have confirmed that it is sensible to discard the possibility that this is either a periodic or quasiperiodic dynamical state. With experimental data like ours, Poincare sections cannot be used much beyond the detection of periodicity and quasiperiodicity because of the small number of points inside a given section. Hence the analysis of Poincare sections often must be restricted to qualitative observations and remarks.

Between qualitative and quantitative analysis lies the identification of recurrent points. Fig. 6a is the histogram giving the number of ε-recurrent points as a function of the order m of the recurrence. Peaks indicate possible periodicities. On the histogram, recurrence orders (e.g. m = 11, 21, 22, 69, 93, 163, 326, 491, ...) can be identified for ε = 20 (that is about 0.01 diameter of the attractor). Peaks 163, 326 and 491 are the typical signature of a strongly recurrent behaviour, the phase space trajectory following the same path at equally repeated time intervals. Once the orders of recurrence have been identified, it is then possible to draw the corresponding portions of the phase space trajectory and to detect how recurrences are due to the sticking of the trajectory in the neighbourhood of a periodic orbit (e.g. figs. 6c-e). Hence, it appears that recurrence peaks 21-22 cover two different orbits, the first one exhibiting a single medium size loop on the 2D projection (fig. 6c) and the second one being a large two-loop orbit (fig. 6d). For a higher recurrence order like 69 (fig. 6e), the 2D projection of the orbit is more complex, showing at least four loops. In fig. 6, each presumably unstable orbit is shadowed by several portions of trajectory close to each other that have been distinguished by colors which, unfortunately, cannot be reproduced in this paper. The portions of trajectory look like broken lines because the sampling time ($5 \times 10^{-3}$ s) is only 40 times smaller than the pseudoperiod of the signal. Lack of time explains why it has only been possible to draw fig. 6 as a 2D projection without 3D embedding. For the same reason and also owing to the small amount of experimental data, we have not yet been able to extract from periodic orbits further information like winding numbers or symbolic dynamics. In principle this would be feasible [58, 59] but it still requires a lot of work.

Fig. 5. A typical experimental chaotic time series (dynamical state labelled C17 in fig. 3). (a) 3D display; (b) Poincare section (y(3) = 4500, trajectory going downward).

Fig. 6. Detection of ε-recurrent points and periodic orbits (ε = 20; sampling frequency 200 Hz). (a) number of ε-recurrent points vs recurrence order m ($R^3$ embedding; $R^{10}$ exhibits the same features but with a more noisy histogram). (b)-(e) 2D-projections of the attractor (b) and of orbits corresponding to recurrence orders 21 (c), 22 (d), 69 (e) (same axes and scales).

Quantitative characterization of the attractor has been carried out by dimension computations (see section 3.2). The time series contains N = 163 840 data points. Averages involved in eqs. (2)-(6) have been carried out over m = 10 000 points randomly distributed with respect to the probability measure. The sampling frequency is 200 Hz and the pseudo frequency of the time signal (highest peak in power spectrum) is 5 Hz. Hence, the time series contains about 40 points per pseudo period. The time delay has been adjusted to obtain the best stability of the information dimension $D_1$ in the fixed radius approach: in 55“’, a time delay of $25 \times 10^{-3}$ s (that is, choosing one data point in five) gives the largest and most stable plateau for the estimation of $D_1$. This value has been retained for all other computations.
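The figures quoted in this paragraph can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only the values stated in the text; the variable names are ours.

```python
# Values quoted in the text.
fs = 200.0            # sampling frequency (Hz)
f_pseudo = 5.0        # pseudo frequency of the signal (Hz)
n_points = 163_840    # number of stored data points
delay = 25e-3         # retained time delay (s)

dt = 1.0 / fs                           # sampling time: 5e-3 s
points_per_period = fs / f_pseudo       # 40 points per pseudo period
delay_in_samples = round(delay / dt)    # 5, i.e. one data point in five
delay_fraction = delay * f_pseudo       # 0.125, one eighth of a pseudo period
record_length = n_points * dt           # about 819.2 s of signal
n_pseudo_periods = record_length * f_pseudo   # about 4096 pseudo periods
```

The retained delay is thus consistent with the value of one eighth of a pseudo period mentioned in section 3.2.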

Figs. 7-10 summarize the results of the fixed radius approach. Fig. 7 shows eq. (5) for different dimensions of the embedding phase space. The ordinate in fig. 7 is the l.h.s. of eq. (5), i.e. the average logarithm of the probability, and the abscissa is the logarithm of the box size r. If the scaling law (5) holds, the slope is the information dimension $D_1$. The embedding phase space dimensions range over [2, 20] and r over [70, 700]. A scaling zone where the different curves are parallel straight lines can be located for 230 < r < 420 (that is about 0.06-0.12 in units of the attractor diameter). The local slopes of the curves of fig. 7 are shown by fig. 8 where the scaling zones more or less look like plateaus.

For increasing embedding phase space dimensions, noise progressively invades and submerges small r-values due to decreasing statistics which smoothes small scale structures of the attractor (slope decreasing to zero). Simultaneously the curves are shifted upward to an asymptotic level. On the right, the curves steepen because a high embedding dimension stretches the attractor, setting many points at large distances that are of the order of the size of the attractor.

Fig. 7. Eq. (5): Average logarithm of the probability of a box as a function of box size r (log scale). Different curves correspond to different dimensions n of the embedding phase space.

Fig. 8. Local slope of curves of fig. 7. (Plot legend, embedding dimensions: 2, 6, 8, 10, 12, 14, 18, 22, 26, 30.)

Fig. 9. Information dimension $D_1$ vs. dimension of the embedding phase space.

Fig. 10. Spectrum of Renyi dimensions $D_q$. (Some standard deviations are given; they increase for decreasing q.)

The information dimension $D_1$ has been computed as an average of the local slopes over the scaling zone, the scaling zone itself being determined by an automatic procedure which avoids subjective or ad hoc choices. It would require too much space to detail the procedure here. Basically, the procedure does not involve human interpretation except for the choice of the value of the error bar for the ordinate of the plateau (see ref. [60]).
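The procedure of ref. [60] is not reproduced here. Purely as an illustration of what an automatic plateau search can look like, the Python sketch below computes local slopes and keeps the longest run of consecutive slopes that stay within a prescribed error bar. The function name `find_plateau`, the "longest run" criterion and the parameter `tol` are assumptions of this sketch and should not be read as the authors' algorithm.

```python
import numpy as np

def find_plateau(log_r, y, tol):
    """Longest run of local slopes dy/d(log r) whose spread is <= 2*tol.

    Returns (mean slope over the run, (i, j)) where the run covers the
    indices i .. j-1 of the input arrays. tol plays the role of the error
    bar on the ordinate of the plateau.
    """
    slopes = np.gradient(np.asarray(y, float), np.asarray(log_r, float))
    best = (0, 1)
    for i in range(len(slopes)):
        lo = hi = slopes[i]
        for j in range(i, len(slopes)):
            lo = min(lo, slopes[j])
            hi = max(hi, slopes[j])
            if hi - lo > 2 * tol:        # run left the error bar: stop extending
                break
            if j + 1 - i > best[1] - best[0]:
                best = (i, j + 1)
    i, j = best
    return float(slopes[i:j].mean()), (i, j)

# Usage on a synthetic curve: flat, then slope 4, then slope 8.
log_r = np.linspace(-2.0, 4.0, 61)
y = np.where(log_r < -1, 0.0,
             np.where(log_r <= 3, 4 * (log_r + 1), 16 + 8 * (log_r - 3)))
slope, zone = find_plateau(log_r, y, tol=0.3)
```

On real data the curve would be the average logarithm of the probability of fig. 7, and the returned mean slope would be the estimate of the information dimension over the detected scaling zone.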

The result of the averaging is displayed in fig. 9 as a function of the embedding phase space dimension n. $D_1$ decreases for low dimensions and for high dimensions. For high dimensions, the decrease is due to noise and for low dimensions it automatically results from the fact that $D_1$ cannot be higher than n for nD-embeddings. For intermediate dimensions an asymptotic value around 4 is obtained: for instance $D_1$ = 4.0 in R** (std dev. 0.3) and $D_1$ = 4.1 in R‘* (std dev. 0.3).

Finally, fig. 10 shows the spectrum of Renyi dimensions computed with the fixed radius method in $R^{10}$ (although R'* is better suited for the computation of $D_1$, $R^{10}$ is the best compromise when computing $D_q$’s for various q-values). q ranges from -15 to 25 but, for negative values, the results are only rough estimates with high standard deviation. As a matter of fact, due to empty boxes, the local slopes are strongly underestimated when the radii of the boxes are small. Thus we have been obliged to abandon the automatic procedure of averaging the local slopes and to come back to approximate and subjective manual procedures with subsequent adjustments to fit both procedures on the same figure. However, fig. 10 displays a typical profile of a spectrum complying with inequality (7) and, without giving too much significance to the numerical values of the dimensions, there is no doubt that they extend over a rather broad range (roughly between 1 and 5). Accordingly, we can conclude that the attractor is likely to be strongly inhomogeneous because the dimensions of sparsely filled regions are clearly different from those of densely filled regions (see comment on eq. (7)). Local dimensions have been computed with the purpose of assessing the inhomogeneity of the attractor but the set of experimental data is too small to obtain any sensible values and distribution of local dimensions over the attractor.

A specific remark is necessary about $D_q < 2$: in principle, chaos requires $D_q > 2$. Results lower than 2 are likely due to the finite resolution of data. At smaller scales (experimentally unattainable), the points should be distributed over the attractor in such a manner that $D_q \ge 2$.

The fixed mass approach (see section 3.2) confirms the results of the fixed radius one. Fig. 11 shows the results of the fixed-mass approach on the same time series and with the same parameters (m = 10 000, time delay $25 \times 10^{-3}$ s and a space dimension of 10). The D(0) dimension is plotted as a function of the number N of points, for different values of the number of neighbours.
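A fixed mass estimate in the spirit of eq. (6) can be sketched as follows. This is an assumed variant chosen for brevity: instead of following D(0) as a function of N for each number of neighbours, as in fig. 11, the sketch fits the average logarithm of the k-th nearest neighbour distance against log(k/N) at fixed N. The function name and the brute-force neighbour search are likewise assumptions.

```python
import numpy as np

def dimension_fixed_mass(X, ks, n_ref=200, seed=0):
    """Estimate D(0) from <log delta(j, p)> = (1/D) log p with p = k/N.

    X: points on the attractor, shape (N, n). ks: numbers of neighbours
    (each >= 1 and < N). delta(j, k/N) is the distance from reference
    point j to its k-th nearest neighbour.
    """
    X = np.asarray(X, dtype=float)
    N = len(X)
    rng = np.random.default_rng(seed)
    ref = rng.choice(N, size=min(n_ref, N), replace=False)
    d = np.linalg.norm(X[ref, None, :] - X[None, :, :], axis=2)
    d.sort(axis=1)               # column 0 is the reference point itself
    ks = np.asarray(ks)
    mean_log_delta = np.log(d[:, ks]).mean(axis=0)
    slope, _ = np.polyfit(np.log(ks / N), mean_log_delta, 1)
    return 1.0 / slope           # the slope is 1/D

# Usage on a set of known dimension 1 (points on a circle).
theta = 2 * np.pi * np.arange(2000) / 2000
circle = np.column_stack([np.cos(theta), np.sin(theta)])
d0 = dimension_fixed_mass(circle, ks=[10, 20, 40, 80, 160])
```

Repeating the call on growing subsets of the data gives curves comparable in spirit to those of fig. 11.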

The first point to emphasize is the good asymptotic behaviour of the curves. Hence, we are not obliged to take an average value on a quite subjective number of points, and the result is more stable than with the fixed radius method.

The second point is that the value of D(0) compares well with $D_1$ computed previously: D(0) = 3.8 ± 0.1. Both methods thus agree about a dimension of the attractor close to 4. Such a rather high dimension may a priori be suspect because the computations are sensitive to various biases. However, the quality of our data and the coherence of our results strongly support an actual dimension of about 4 (see following section).

Fig. 11. Asymptotic behaviour of D(0) dimension function vs. number of data points N. (Fixed mass method.) Each curve is for a given number of nearest neighbours k.


4.3.  A  plea  for  the  use  of  combined  approaches 

The physicist dealing with experimental chaotic systems, as we do here, should be convinced that combined approaches to the experimental attractor are not only useful but almost necessary. In thermal lensing, owing to the complexity of the hydrodynamical properties involved, there is much more that we do not know than that we know about the physics of the experiment. Thus, any single result remains questionable and combined approaches are the only means of establishing firm results. For instance, an attractor dimension of about 4 seems a priori too high to establish low dimensional chaos and this result may raise the scepticism of the reader. We were equally sceptical of our first dimension computations. But, combining several further approaches, we note that the fixed mass dimension confirms the fixed radius one, that the power spectra and the 3D embeddings attest to the quality of the signal with low noise level and that the Poincare sections and periodic orbits offer qualitative insight into the structure of the attractor. Taken in isolation, each of these observations remains weak but their convergence gives our conclusions a reasonable confidence level.


Combined approaches do not only support the final results that are published in scientific papers, they also enrich the researcher’s feeling for and analysis of the dynamical systems. Let us emphasize three features that we have found to be significant but that are only poorly reflected in static, frozen scientific papers: (i) graphical displays give access to a powerful global perception of the results because our visual system is well suited for catching and analysing a large amount of data; (ii) real-time observation of displays reveals the chronology of the phenomenon as the points light up successively on the screen; (iii) on-line implementation of graphical tools helps the researcher in choosing and monitoring the parameters of the experiment because it immediately shows the evolution of the graphical display when parameters are changed.

5.  Conclusion 

Heating just below the free surface of a liquid can induce oscillations and various kinds of dynamical states. The heartbeat experiments use heating by a focused laser beam which simultaneously gives an enlarged picture of the dynamical system due to thermal lensing. Thus the experiment has the advantages of visual evidence and of optically based diagnostics and measurements. We have recorded many time series from the HBl experiment, some of them being chaotic. For the analysis of the experimental time series and for the understanding of the dynamical system, the use of multiple and complementary approaches is not only desirable but almost necessary. Particularly powerful are 2D and 3D displays which give a global visual perception of the behaviour of the system and of its evolution when the parameters are changed. These graphical tools are at their best when they allow permanent feedback by implementation in real time processing of experimental data. Several chaotic attractors have been detected by analysis of HBl time series and one example has been reported in the present paper. 3D reconstructions, power spectra and Poincare sections have established the presence of temporal chaos. The analysis of recurrent points has revealed some of the underlying presumably unstable periodic orbits. The computations of Renyi dimensions with the Grassberger-Procaccia algorithm have been done and published both with the fixed radius approach and with the fixed mass one. They give an information dimension of the attractor of about 4. This value suggests that theoretical models of HB experiments should involve at least four or five ordinary differential equations. Other papers report some significant work which has been done to design such theoretical models of the HB system, reducing the actual system to a dynamical system with few degrees of freedom [21-23]. For instance, two (well-designed) coupled ODEs suffice to approach the understanding of the first Hopf bifurcation and, with three coupled ODEs, secondary instabilities can be modelled.

Forthcoming work will aim at completing the analysis of chaotic
time series (entropies, Lyapunov exponents) and at obtaining
similar behaviours with a hot wire experiment. The way will then
probably be open to produce and analyze spatio-temporal chaos in
heated surface instabilities.

Rotational Taylor-Couette flow shows a rich variety of well-defined and controllable scenarios. Even for short
cylinders, where the number of possible flow states is found to be very small, period doubling cascades, intermittencies,
homoclinic orbits and n-tori can be found depending on boundary conditions. We characterize these flow states by fractal
dimensions, Lyapunov spectra and entropies. In the experiments the axial velocity component was measured using
laser-Doppler anemometry and the phase space was reconstructed by using delay time coordinates. A simple noise
reduction algorithm was applied to the noisy data set to improve the accuracy of the estimation of the dynamical variables.


1.  Introduction 

The flow of a viscous fluid between two rotating coaxial cylinders
has long been one of the most prominent examples in the study of
hydrodynamic instabilities [1]. Though the experiment is
comparatively easy to perform and the external parameters can be
satisfactorily controlled, the theoretical calculations lead to
exact solutions of the Navier-Stokes equation only for small
Reynolds numbers and only if the cylinders are considered to be
infinitely long. Nevertheless many ideas concerning hydrodynamic
instabilities and the route to chaos could be examined
experimentally and verified numerically [2-12]. For a better
understanding of the Taylor-Couette flow one has to know all
solutions as a function of their location in the control parameter
space. This requires the analysis of many points in the parameter
space. For the estimation of dynamic variables like fractal
dimensions and Lyapunov spectra one therefore needs efficient
algorithms which can handle noisy data sets restricted in
resolution and number.


To limit the number of solutions we focus on Taylor-Couette flow
with short cylinders, where this number of solutions is assumed to
be moderately small ([9] and papers cited therein). Even with this
restricted configuration, we can show that there is no universal
route to chaos for Taylor-Couette flow. Instead there exists a
rich variety of scenarios. In this paper we present experimental
results for three of them: period doubling, n-torus and
intermittency. The interpretation of the results shows the
necessity of noise reduction methods, and we present a simple
algorithm implemented by us.

2.  Flow  apparatus  and  measuring  techniques 

The inner cylinder of the Taylor-Couette apparatus used in the
experiments was machined from stainless steel with a radius of
$r_1 = 12.5$ mm, while the outer cylinder was made from optically
polished glass with a radius of $r_2 = 25$ mm. The accuracy of the
radii is better than 0.01 mm over the entire length of 640 mm. The
flow was confined between a rotating inner cylinder, a stationary
outer cylinder and stationary bottom and top plates. The length of
the gap could be varied continuously by moving the metal collar
which provides the top surface of the flow domain. The aspect
ratio $\Gamma$, used as a geometrical control parameter, is
defined as the ratio of gap height to gap width. We used silicone
oil as the working fluid, with different viscosities depending on
the flow situation. Working at very low gap height to width
ratios, it turned out that even small dirt particles (<1 mm) in
the fluid can perturb the flow in some cases, so the fluid was
cleaned with a filter having a mesh size of 5 μm.

The external control parameter is the Reynolds number, defined as
$Re = \Omega d r_1/\nu$, where $\Omega$ is the angular frequency
of rotation of the inner cylinder, $d = r_2 - r_1$ the gap width
and $\nu$ the kinematic viscosity. The temperature of the fluid
was kept constant to within 0.01 K by circulating thermostatically
controlled silicone oil through a surrounding square box. A
phase-locked loop circuit controlled the speed of the
inner  cylinder  to  better  than  one  part  in  10  "  in 
the  short  term  and  better  than  one  part  in  10” 
in  the  long  term  average.  Thus  the  accuracy  of 
the  absolute  value  of  the  Reynolds  number  was 
about  1%  and  for  relative  changes  better  than 
10”^  The  local  velocity  was  measured  by  a  real 
fringe laser-Doppler anemometer and recorded by a
phase-locked-loop analogue tracker. Owing to the arrival
statistics of the scattering particles, the signal contains
Doppler-phase noise. The analogue output voltage of the tracker,
which is proportional to the local velocity component, was
filtered by an analogue Bessel filter of fourth order with a
cutoff frequency depending on the flow situation and the highest
significant frequency of the system. The velocity signal was fed
into an AD-converter with 10 or 12 bit resolution and then into a
computer where the data processing was performed. For more details
of the experimental setup see [9-11].
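
As a small worked illustration of the two control parameters defined in this section, the following Python sketch evaluates the Reynolds number $Re = \Omega d r_1/\nu$ and the aspect ratio from the cylinder radii quoted above. The viscosity and rotation rate are hypothetical values chosen only for the example; the paper does not state them.

```python
import math


def reynolds_number(omega, r_inner, r_outer, nu):
    """Re = omega * d * r_inner / nu, with gap width d = r_outer - r_inner."""
    gap_width = r_outer - r_inner
    return omega * gap_width * r_inner / nu


def aspect_ratio(gap_height, r_inner, r_outer):
    """Ratio of gap height to gap width."""
    return gap_height / (r_outer - r_inner)


# Radii from the text (converted to metres).
R_INNER = 12.5e-3
R_OUTER = 25.0e-3

# Hypothetical values, NOT taken from the paper.
NU_ASSUMED = 1.0e-5              # kinematic viscosity in m^2/s
OMEGA_ASSUMED = 2.0 * math.pi    # inner cylinder at one revolution per second

re_example = reynolds_number(OMEGA_ASSUMED, R_INNER, R_OUTER, NU_ASSUMED)
gamma_example = aspect_ratio(5.875e-3, R_INNER, R_OUTER)
print(f"Re = {re_example:.1f}, aspect ratio = {gamma_example:.2f}")
```

Because $Re$ is linear in $\Omega$, a relative error in the rotation rate carries over directly to the Reynolds number, which is why the speed control matters for the quoted accuracy.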

The first step in analyzing the time series obtained is the
reconstruction of the attractor by delay time coordinates. New
methods enable us to choose a proper delay time and a sufficiently
large embedding dimension. Data sets with a high noise level are
treated by a noise reduction method described below. The resulting
data sets and those with a low initial noise level can then be
characterized by classical analysis (power spectra and
autocorrelation function) and by modern techniques yielding
fractal dimensions, Lyapunov spectra and entropies.

2.1.  Bifurcations  of  Taylor-Couette  flow 

The basic flows considered in this paper are illustrated in the
upper part of fig. 1, where the flow patterns in a radial plane of
the two flow types existing for aspect ratios $\Gamma \le 1.2$ are
shown; i and o indicate the inner and outer cylinder,
respectively. The two-cell flow (s for symmetric) is mirror
symmetric relative to the midplane, with a maximum outward flow
velocity in this plane and a slower reflux at the top and bottom
plates. For this flow mode the axial velocity component $v_z$ in
the midplane is zero for all Reynolds numbers. The single-cell
flow (a for asymmetric) appears in two equivalent modes, with one
big vortex and a small weak one near the bottom or top plate,
respectively. The vortices have contrary directions of rotation.
In contrast to the symmetric mode, the axial velocity component of
the single-vortex mode is non-zero in most locations in the
midplane. With a proper position of the measurement volume it is
therefore possible to characterize the actual flow mode by
measuring a single velocity component. The approximate location of
the measurement volume is indicated by X in the schematic
representation of the flow pattern.


Fig. 1. (a) Flow patterns for the symmetric two-vortex state
(s) and the single-vortex state (a), drawn after a visualization
experiment; (b) corresponding bifurcation diagram.

The lower part of fig. 1 shows a bifurcation diagram of these two
flow modes at an aspect ratio of $\Gamma = 0.47$. For this
measurement the Reynolds number was increased at a quasistatic
rate from Re = 200 to Re = 600 while the axial velocity component
was recorded. This procedure was repeated for all branches.

For Reynolds numbers smaller than 320 the axial velocity component
is zero, indicating the symmetric two-vortex state. The branch of
this flow mode is marked with S. At point A a supercritical
bifurcation appears: the symmetric solution loses stability and
the two asymmetric modes, marked with a in the diagram, appear. At
this point A a critical slowing down takes place, i.e. the time
constant diverges, so special care has to be taken while passing
the point. This bifurcation is disconnected due to very small
imperfections in the apparatus, and thus there is a smooth
development of one of the single-cell flows. The other branch can
be reached by sudden starts or finite perturbations of the flow.
Those branches are marked with an asterisk. Following the branches
of the two single-cell modes to higher Reynolds numbers, a Hopf
bifurcation occurs (marked with G).

The time dependent flow modes can either be an m = 2 mode (m being
the azimuthal wave number) or an m = 3 mode, depending on the
aspect ratio $\Gamma$.

The symmetric solution which loses stability at point A
restabilizes at point B, starting a secondary symmetric branch S*.
This branch can be obtained experimentally only by a sudden jump
from the stable symmetric solution (S) to Reynolds numbers Re
above the critical value B. For higher Reynolds numbers this
branch shows a Hopf bifurcation at point C.

Depending  on  slight  changes  of  flow  geometry 
and  aspect  ratio  the  time  periodic  flow  starting  at 
B  can  show  quite  different  scenarios  on  the  route 
to  chaos.  (As  an  example  a  period  doubling 
cascade  will  be  shown  below.) 

The basis of a full understanding of this flow is the knowledge of
all critical points. In fig. 2 we plot all known critical points
as a function of the control parameters, the Reynolds number and
the aspect ratio $\Gamma$, taken from measurements and numerical
calculations [9].

For this $\Gamma$-Re plot hundreds of bifurcation diagrams like
the one shown in fig. 1 had to be recorded, most of them by
ramping the Reynolds number in both directions to detect
hysteresis effects.

The solid lines marked with G indicate the loci of Hopf
bifurcations evolving from the stationary single-cell mode. For
aspect ratios $\Gamma < 0.8$ an oscillatory state with an
azimuthal wave number m = 3 occurs, for $\Gamma \ge 0.8$ an m = 2
mode. This m = 2 mode is of particular interest, because it shows
a hysteresis region, marked by H, and two lines of homoclinicity,
marked with $HO_1$ and $HO_2$, respectively. This gives a band of
stationary flow which extends up to Reynolds number Re = 1400.
Approaching this band from both sides in the $\Gamma$-Re plane,
the frequency of the m = 2 mode goes to zero while a constant
amplitude is preserved (details can be found in [13]). The
development of these time periodic states towards higher Reynolds
numbers depends strongly on the flow mode. Even for the same basic
flow we find a large number of scenarios depending on the aspect
ratio $\Gamma$. The work on this topic has been started only
recently, so we present here three examples: quasiperiodicity,
period doubling and intermittency.


Fig. 2. Stability diagram in the Re-$\Gamma$ plane. Dashed-dotted
lines: region of steady solutions; solid lines: regime of time
dependent asymmetric solutions; dashed line: regime of
intermittent solutions.


3. Reconstruction of phase space from scalar
time series


The analysis of chaotic time series requires a careful application
of methods which yield fractal dimensions, Lyapunov exponents and
entropy of attractors in phase space. In many experimental
situations only a few of the system's observables can be obtained.

We reconstruct the phase space from states of the Taylor-Couette
experiment, where only one component of the local velocity is
measured, by the use of Takens' delay time coordinates [14].

$\{x(t_k)\}$ with $k = 0, \ldots, N_{dat} - 1$ is the scalar time
series, and $N_{dat}$ is the number of sampled data points.
Vectors in the embedding space are given by

$$\mathbf{x}(t_i) = \bigl(x(t_i),\; x(t_i + \tau),\; \ldots,\; x(t_i + \tau(\dim_E - 1))\bigr)^{T} \tag{1}$$

where $i = 0, \ldots, N - 1$ and
$N = N_{dat} - (\tau/T_s)(\dim_E - 1)$, $\dim_E$ is the embedding
dimension, $\tau$ is the delay time and $T_s$ the sampling time.
For convenience we write $\mathbf{x}_i$ instead of
$\mathbf{x}(t_i)$. The choice of a proper delay time and a
sufficiently large embedding dimension, which is not known a
priori, is not trivial for time series restricted in resolution
and number of data points. Recently we developed two methods to
calculate optimal embedding parameters. The first algorithm yields
a global static measure, called fill factor, which gives an
estimate of the phase space utilization for (quasi-)periodic and
strange attractors and leads to a maximum distance of trajectories
[15]. The second algorithm, the integral local deformation,
describes the local dynamical behaviour of points on the attractor
and gives a measure of the homogeneity of the local flow [16]. A
comparison of different algorithms is presented in [17].
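
The delay-coordinate construction of eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `delay_embed` is hypothetical, the delay is given as an integer number of samples (that is, $\tau/T_s$), and the sine wave is only a stand-in for a measured velocity signal.

```python
import numpy as np


def delay_embed(series, dim_e, lag):
    """Build delay vectors x_i = (x[i], x[i+lag], ..., x[i+(dim_e-1)*lag]).

    `lag` is the delay time expressed in sampling intervals.
    Returns an array of shape (n_vectors, dim_e).
    """
    series = np.asarray(series, dtype=float)
    n_vectors = series.size - (dim_e - 1) * lag
    if n_vectors <= 0:
        raise ValueError("time series too short for this embedding")
    columns = [series[j * lag: j * lag + n_vectors] for j in range(dim_e)]
    return np.column_stack(columns)


# Stand-in signal (assumption): a sampled sine wave.
samples = np.sin(0.05 * np.arange(1000))
embedded = delay_embed(samples, dim_e=3, lag=10)
print(embedded.shape)
```

The number of rows equals the number of samples minus `(dim_e - 1) * lag`, which is the count of delay vectors that fit entirely inside the record.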


4.  Characterization  of  time  series 


To characterize chaotic time series it is often insufficient to
calculate only power spectra or autocorrelation functions.
Recently more powerful tools have been developed. With the results
of these algorithms one can distinguish between different chaotic
states of a nonlinear system.


4.1.  The  fractal  dimension 


A useful parameter characterizing the geometry of strange
attractors is the fractal dimension. For the results presented
here we calculated the correlation dimension [18]

$$D_2 = \lim_{R \to 0} \frac{\log C(R)}{\log R} \tag{2}$$

where $C(R)$ is the correlation integral

$$C(R) = \frac{1}{N_{ref} N_{dat}} \sum_{i=0}^{N_{ref}-1} \sum_{j=0}^{N_{dat}-1} \Theta\bigl(R - \|\mathbf{x}_i - \mathbf{x}_j\|\bigr) \tag{3}$$

where $R$ is the scaling radius, $N_{ref}$ the number of reference
points and $\Theta$ the Heaviside function.

For inhomogeneous attractors it is often useful to calculate the
distribution of local or pointwise dimensions

$$D_p(\mathbf{x}_i) = \lim_{R \to 0} \frac{\log C_i(R)}{\log R} \tag{4}$$

where $C_i(R)$ is the local correlation integral

$$C_i(R) := \frac{1}{N_{dat}} \sum_{j=0}^{N_{dat}-1} \Theta\bigl(R - \|\mathbf{x}_i - \mathbf{x}_j\|\bigr) \tag{5}$$
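
The correlation integral and the local slopes of its double logarithmic plot can be sketched as follows. This is a brute-force illustration under stated assumptions: the function names are hypothetical, self-pairs are excluded from the count (so the normalization differs slightly from a plain double sum), and the reference points are simply taken equally spaced along the data set. The test data are points on a circle, for which a slope close to 1 is expected.

```python
import numpy as np


def correlation_integral(points, radius, n_ref=None):
    """Fraction of point pairs closer than `radius`, using n_ref reference points."""
    points = np.asarray(points, dtype=float)
    n_dat = len(points)
    n_ref = n_dat if n_ref is None else min(n_ref, n_dat)
    ref_idx = np.linspace(0, n_dat - 1, n_ref).astype(int)
    count = 0
    for i in ref_idx:
        dist = np.linalg.norm(points - points[i], axis=1)
        count += np.count_nonzero(dist < radius) - 1  # drop the self-pair
    return count / (n_ref * (n_dat - 1))


def local_slopes(points, radii, n_ref=None):
    """Local slope d log C(R) / d log R for each radius in `radii`."""
    c_values = np.array([correlation_integral(points, r, n_ref) for r in radii])
    return np.gradient(np.log(c_values), np.log(radii))


# Stand-in attractor (assumption): a limit cycle, i.e. points on a unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
radii = np.logspace(-2, -0.5, 8)
slopes = local_slopes(circle, radii, n_ref=100)
print(np.round(slopes, 2))
```

In practice the dimension is read off from a plateau of the local slopes over a range of radii, not from a single value, since noise inflates the slope at small $R$ and the finite attractor size distorts it at large $R$.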

4.2.  The  Lyapunov  exponents 


The most significant parameter to detect chaotic dynamics and to
distinguish between different chaotic states is the spectrum of
Lyapunov exponents (LE)

$$\lambda_k = \lim_{t_{ev} \to \infty} \frac{1}{t_{ev}} \log \frac{p_k(t_{ev})}{p_k(0)} \tag{6}$$

where $p_k(0)$ is the radius of the $\dim_E$-sphere at the
starting point and $p_k(t_{ev})$ are the principal axes of the
$\dim_E$-ellipsoid at some evolution time $t_{ev}$. The LE
describe the sensitivity of a system to small variations of
initial conditions. So the LE give an estimate of the "strength of
chaoticity". A review of the algorithm we used is given in [19].

In an experiment a $\dim_E$-dimensional linearized flow map has to
be approximated. Strong efforts are made to identify true and
spurious LE when the embedding dimension is larger than the number
of relevant degrees of freedom. Some algorithms are based on the
observation that the true LE change their signs upon time reversal
whereas the spurious exponents do not [20]. Other algorithms use
singular value decomposition techniques to restrict the linearized
flow map to the relevant subspace [21].


4.3.  The  entropy 

The entropy of a system's state is the essential measure of
chaoticity, the average loss of information. Closely related to
the LE, the metric entropy for homogeneous attractors [22] is
given by

$$h = \sum_k \lambda_k^{+}, \tag{7}$$

where $\lambda_k^{+}$ denote the positive exponents. The order-two
Kolmogorov entropy or correlation entropy [22] is given by

$$K_2 = -\lim_{R \to 0} \lim_{\dim_E \to \infty} \frac{1}{\tau} \log \frac{C_{\dim_E + 1}(R)}{C_{\dim_E}(R)} \tag{8}$$

where $C_{\dim_E}(R)$ denotes the correlation integral in
$\dim_E$-dimensional embedding space. $K_2$ can be estimated from
the slope of the logarithm of the ratio of successive correlation
integrals versus delay time at the accumulation line for higher
$\dim_E$ and small $\tau$ [17]. For further practical estimates of
$K_2$ see [23].


5.  Noise  reduction 

Experimental attractors from the Taylor-Couette experiment are
always noisy, due to the Doppler-phase noise on the LDV signal.
This effect is caused by the arrival statistics of the light
scattering particles.

Recently many efforts have been made to find the underlying
noiseless attractor [24-27] to ensure accurate calculations of
fractal dimensions and Lyapunov spectra. It is well known that
using filters in the time domain can affect the values of the
dynamical variables [28, 29].

The new noise reduction method presented here eliminates high
frequency noise while preserving the original dynamics. The noise
reduced attractor enables us to estimate the dynamical variables
more precisely. We tested our new approach on experimental
(Taylor-Couette) as well as on numerical (Duffing oscillator) time
series.

5.1. The noise reduction method

The method presented here uses the average flow vector instead of
an approximation of the linearized flow map. We define a local
average displacement vector $\mathbf{f}(\mathbf{x}_i)$ as the
difference between the center of mass of a cloud of neighbouring
points at the reference point and the center of mass of the same
ensemble one time
step  later.  Considering  a  window  of  p  points 
one  obtains  new  points 
by  minimizing  the  sum 

*=o 

The minimization satisfies the following two conditions:

- The new points must be reconstructable
from a time series.

- Only those coordinates are varied which are
independent of the points outside the window.

The first term of expression (9) describes the deviation from the
averaged direction, the second term describes the distance from
the old data points. The noise reduction is applied successively
to all data points of the attractor. The whole procedure is
repeated until the noise level is reduced below a given threshold.
This procedure works only if the number of data points per mean
period of the attractor is sufficiently high. The algorithm is
comparatively easy to implement even on small computers. The work
on this noise reduction algorithm and a comparison with other
methods is still in progress.
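
The local average displacement vector defined at the start of this subsection can be sketched in Python as follows. This covers only that one ingredient, not the minimization of expression (9) or the full iteration. The function name and the choice of a fixed neighbourhood radius are assumptions made for the illustration; the paper does not say how the cloud of neighbours is selected.

```python
import numpy as np


def local_average_displacement(points, i, radius):
    """Displacement of the centre of mass of the neighbours of points[i]
    over one time step.

    Neighbours are all embedded points within `radius` of points[i]
    (the reference point itself included). The last point is excluded
    as a candidate because it has no successor.
    """
    points = np.asarray(points, dtype=float)
    candidates = points[:-1]
    dist = np.linalg.norm(candidates - points[i], axis=1)
    neighbours = np.flatnonzero(dist <= radius)
    if neighbours.size == 0:
        raise ValueError("no neighbours within the given radius")
    centre_now = points[neighbours].mean(axis=0)
    centre_next = points[neighbours + 1].mean(axis=0)
    return centre_next - centre_now


# Toy trajectory (assumption): uniform motion along a straight line,
# so every step has the same displacement (1, 2).
steps = np.arange(10, dtype=float)
line = np.column_stack([steps, 2.0 * steps])
print(local_average_displacement(line, i=4, radius=3.0))
```

Averaging over a cloud of neighbours suppresses uncorrelated noise on the individual points, which is the reason the averaged direction can serve as a reference for the corrected trajectory.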

5.2.  Test  of  the  noise  reduction  method 

Below we present the results of our method for a quasiperiodic
time series from Taylor-Couette flow and for a numerical, chaotic
time series (Duffing oscillator:
$\ddot{x} + D\dot{x} + x + x^3 = F\cos\omega t$, where D = 0.2,
F = 40 and $\omega$ = 1). The number of points per mean cycle is
78 for the experimental time series and 126 for the numerical
data.
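
A numerical Duffing time series of the kind used for this test can be generated with a short Python sketch. It assumes the cubic form $\ddot{x} + D\dot{x} + x + x^3 = F\cos\omega t$ with the parameter values quoted above; the fixed-step fourth-order Runge-Kutta integrator, the step size and the initial condition are choices made for the illustration and are not taken from the paper.

```python
import numpy as np


def duffing_rhs(t, state, damping=0.2, force=40.0, omega=1.0):
    """First-order form of x'' + D x' + x + x^3 = F cos(omega t)."""
    x, v = state
    return np.array([v, -damping * v - x - x**3 + force * np.cos(omega * t)])


def integrate_rk4(rhs, state0, t_end, dt):
    """Classical fixed-step RK4; returns an array of shape (n_steps + 1, dim)."""
    n_steps = int(round(t_end / dt))
    state = np.array(state0, dtype=float)
    trajectory = np.empty((n_steps + 1, state.size))
    trajectory[0] = state
    t = 0.0
    for k in range(n_steps):
        k1 = rhs(t, state)
        k2 = rhs(t + 0.5 * dt, state + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, state + 0.5 * dt * k2)
        k4 = rhs(t + dt, state + dt * k3)
        state = state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += dt
        trajectory[k + 1] = state
    return trajectory


# Assumed step size and initial condition, chosen only for this example.
trajectory = integrate_rk4(duffing_rhs, [0.0, 0.0], t_end=20.0, dt=0.01)
scalar_series = trajectory[:, 0]  # the "measured" observable x(t)
print(scalar_series[:5])
```

For a noise test, white noise of a chosen relative amplitude could be added to `scalar_series` before embedding; the initial transient should be discarded first.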

To demonstrate the noise reduction scheme we applied this
algorithm to the quasiperiodic attractor. Fig. 3a shows the phase
space reconstruction, Poincaré section and circle map of a torus
that was low pass filtered with a cutoff frequency of 500 Hz,
which is more than 100 times higher than the fundamental frequency
in the system. Fig. 3b shows the same state after the noise
reduction procedure.


Fig. 3. Reconstructed attractors, Poincaré sections and circle
maps of the experimental quasiperiodic time series: (a) 500 Hz
low pass filtered, (b) noise reduced.

The power spectra for these two cases, shown in fig. 4,
demonstrate that the noise between the two frequencies is reduced
while the intensity ratio is left unchanged. In fig. 5 the slope
of the double logarithmic plot of the correlation integral versus
scaling radius is shown for both cases. The noise reduction scheme
yields a proper estimate of the correlation dimension. The
expected value of $D_2$ for the torus is indicated by dotted
lines. The noise reduction scheme was applied directly to the data
set shown in fig. 3a, whereas the best procedure is to start with
a proper setting of the analogue filter.

In order to show that the method works with chaotic attractors as
well, we added about 4% white noise to the numerical time series
from the Duffing system. The number of data points is
$N_{dat} = 32768$ at a resolution of 10 bit. To compare the
quality of the noise free, noise covered and noise reduced time
series we estimated the first two LE for embedding dimensions up
to $\dim_E = 12$. The LE were estimated by calculating the
linearized flow map. Fig. 6 illustrates that the results obtained
from the noise reduced time series agree with the noise free data.
The dotted lines indicate the first two exponents calculated
directly from the known linearized flow map. The exponents can
hardly be estimated from the noise covered time series. In table 1
the first three LE are shown for the different time series. We
averaged the relevant exponents for $\dim_E = 4$ to
$\dim_E = 12$. The given errors are the statistical inaccuracy of
the means.


Fig. 6. The non-negative Lyapunov exponents versus embedding dimension for the noise free (dashed lines), noise covered
(dashed dotted lines) and noise reduced Duffing-attractor (dotted lines). The horizontal dashed lines indicate the correct values
for the first two Lyapunov exponents.

Table 1
Lyapunov spectra for the noise free, noise covered and noise reduced Duffing attractor in comparison with the theoretical values.

        Noise free      Noise covered    Noise reduced    Theor.
        time series     time series      time series
λ1       0.15 ± 0.01     0.40 ± 0.11      0.16 ± 0.02      0.16
λ2       0.00 ± 0.01    -0.65 ± 0.34     -0.04 ± 0.02      0.00
λ3      -0.51 ± 0.05    -5.70 ± 1.40     -0.57 ± 0.10     -0.45


6.  Scenarios 

In this section we present examples of experimental results from
the Taylor-Couette flow. The goal is to show the variety of
scenarios and how the algorithms work when applied to the given
experiment.

6.1.  Period  doubling 

The Hopf bifurcation evolving from the restabilized symmetric
two-cell state, indicated by line C in fig. 2, can show quite
different scenarios on the route to chaos depending on slight
variations of the aspect ratio. The most remarkable scenario is a
period doubling cascade.

Figs. 7a to 7d show the fundamental limit cycle and three period
doubling bifurcations leading to 8 times the fundamental period.
For Reynolds numbers larger than the accumulation point the flow
becomes chaotic (7e), then shows restabilization (7f) for higher
Reynolds numbers. For this scenario we calculated the correlation
dimension (circles), the mean pointwise dimension (diamonds) and
the Kaplan-Yorke dimension (squares) shown in fig. 8a. As
expected, the dimensions increase rapidly to values larger than 2;
above the critical Reynolds number 568 we found experimentally the
period-3 window, which agrees with the simple logistic model. In
this window the dimensions are found to be 1.4 instead of 1, which
we think is caused by noise and, in addition, by not having hit
exactly the correct Reynolds number for the periodic window.
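
The Kaplan-Yorke dimension mentioned above is obtained from the Lyapunov spectrum, and the metric entropy of eq. (7) is the sum of the positive exponents. The paper does not write out the Kaplan-Yorke formula, so the sketch below uses its standard definition, $D_{KY} = j + \sum_{k \le j} \lambda_k / |\lambda_{j+1}|$ with $j$ the largest index for which the partial sum is non-negative. The function names are hypothetical; the example spectrum is the column of theoretical values from table 1.

```python
def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke (Lyapunov) dimension from a list of Lyapunov exponents."""
    ordered = sorted(exponents, reverse=True)
    partial_sum = 0.0
    for j, value in enumerate(ordered):
        if partial_sum + value < 0.0:
            if j == 0:
                return 0.0  # even the largest exponent is negative
            return j + partial_sum / abs(value)
        partial_sum += value
    # The sum never becomes negative: dimension saturates at the
    # number of exponents supplied.
    return float(len(ordered))


def metric_entropy(exponents):
    """Sum of the positive Lyapunov exponents, as in eq. (7)."""
    return sum(value for value in exponents if value > 0.0)


# Theoretical Duffing exponents from table 1.
duffing_theory = [0.16, 0.00, -0.45]
d_ky = kaplan_yorke_dimension(duffing_theory)
h = metric_entropy(duffing_theory)
print(f"D_KY = {d_ky:.3f}, metric entropy = {h:.2f}")
```

For a periodic window the spectrum has one zero and otherwise negative exponents, for which the same function gives a dimension of 1, the value expected for a limit cycle.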

Fig. 8b shows the Lyapunov spectra in the same range of the
Reynolds number (circles, squares and diamonds are the first three
exponents, respectively; the full circles are the values of the
metric entropy). This result is in good agreement with the plot
above. The error bars are estimated from the standard deviation of
the averaging process over several embedding dimensions. This
analysis shows that the finite resolution of these measurements
allows only the detection of the largest periodic window in this
scenario.


Fig. 7. Phase space reconstruction of a period doubling
cascade.

Fig. 8. (a) Different dimensionalities for a period doubling
cascade versus Reynolds number. (b) Lyapunov spectra and
metric entropy versus Reynolds number. The dotted line
shows the first Lyapunov exponent; the given uncertainties
are the standard deviations of the averaging process.

6.2.  Quasiperiodic  sequence 

The Hopf bifurcation occurring at the restabilized two-vortex flow
can show quasiperiodic behaviour for different values of the
aspect ratio $\Gamma$. We find this scenario also in the evolution
of m = 2 and m = 3 modes and often in scenarios at larger aspect
ratios with many vortices. The break-up of a two-torus in
Taylor-Couette flow was the first scenario which was analyzed by
A. Brandstatter and H. Swinney [6] using modern nonlinear
techniques.


Fig. 9. Reconstructed attractors, Poincaré sections and circle
map for a quasiperiodic sequence.

Fig. 9 shows the evolution of a quasiperiodic state towards higher
Reynolds numbers. The three columns show the reconstructed phase
space with the orientation of the Poincaré plane, the Poincaré
section and the corresponding circle map, respectively. In
contrast to the Ruelle-Takens-Newhouse scenario which was analyzed
in [6], we found a stable three-mode state. From the Poincaré
sections and circle maps shown in fig. 9d and 9f we cannot decide
whether this state is chaotic or not. Therefore we estimate the
dimension and entropy of the state shown in fig. 9d. In fig. 10
the evaluation of the dimension is shown. Fig. 10a gives the local
slopes of the double logarithmic plot of the correlation integral
versus log(R). We find a plateau at about R = 4% and estimate the
correlation dimension to be $D_2 \approx 3.2$. In fig. 10b the
distributions of pointwise dimensions are shown; they indicate a
homogeneous attractor. The mean pointwise dimension
$\langle D_p \rangle$ agrees with the correlation dimension.


G. Pfister et al. / Characterization of experimental time series

We want to point out that, although the circle map for case (9d)
suggests non-invertibility and $D_2 = 3.2$, we found that this
state is a three-torus. In fig. 11a the logarithm of the ratio of
successive correlation integrals is plotted versus normalized
delay time, as suggested by eq. (8). From a fit of the
accumulation line one estimates $K_2 = 0$ bits/orbit. In fig. 11b
the correlation entropy is plotted versus embedding dimension.


Fig. 10. (a) Local slopes of the double logarithmic correlation
integral versus logarithmic radius (given in percent of the
total extension of the attractor). (b) Distribution of pointwise
dimension for $\dim_E = 12$, showing a homogeneous attractor
and yielding $\langle D_p \rangle \approx 3.2$.

Fig. 12 gives the estimated dimensions for the whole quasiperiodic
scenario. The arrows marked with a, b, d and e correspond to the
states shown in fig. 9. At a Reynolds number Re = 550, marked with
a, the flow shows a second Hopf bifurcation and the limit cycle
with a dimension of about 1 (experimentally) changes to a
two-torus with a dimension of about 2. At Reynolds numbers of
about 580, marked with b in fig. 12, another Hopf bifurcation
occurs, giving a three-torus. The flow modes, showing three
fundamental frequencies, are physically quite different and we did
not find any phase locking. (Recently we found phase locking of
similar flow modes in a twelve-vortex state; results will be
published elsewhere.) The frequencies of the three modes are
functions of the Reynolds number, so the modes are not
incommensurate for all Reynolds numbers. We caught only the ratio
1:2, where the dimension is decreased by one.

The estimated dimensions are systematically too high, which we
think is due to the noise level of the experiment. This effect
shows the need for efficient noise reduction algorithms. Noise
reduction has not yet been applied to the data for the range
considered in fig. 12.

6.3.  Inlermittency  near  a  homoclinic  orbit 

As shown in fig. 2, Taylor-Couette flows for
aspect ratios 1.05 < Γ < 1.35 show a Hopf bifurcation,
which evolves from the single cell state at
Reynolds numbers Re > 900. The flow mode
with an azimuthal wave number m = 2 is intersected
in the Re-Γ plane by two lines of homoclinicity.
This effect will be described elsewhere.
Fig. 11. (a) Logarithmic ratio of successive correlation integrals versus normalized delay time (T^ is mean recurrence time). The
correlation entropy is estimated to be 0 bits/orbit. (b) Correlation entropy versus embedding dimension estimated from fig.
10a for embedding dimensions 2 to 8 in the delay interval t/T^ = 0.1-0.4. The errors are the standard deviations.


Fig.  12.  Correlation  dimension  and  averaged  pointwise  di¬ 
mension  for  the  quasiperiodic  sequence.  At  the  arrows  a  and 
b  new  incommensurate  modes  appear,  at  arrow  e  two  modes 
become  commensurate  (ratio  of  1:2). 


We focus on a scenario occurring for values of Γ
larger than 1.1 and Re = 1500. Figs. 13a-13c
show the time series, reconstructed attractors
and the distribution of pointwise dimensions for
Γ = 1.2 and three Reynolds numbers. The embedding
dimension is 12 in all cases. Fig. 13a shows a noisy
limit cycle at Re = 1400; in fig. 13b intermittent
behaviour is shown. As can be seen from the
reconstructed phase space, the character of the
limit cycle does not change. Also a high dimensional
region with an approximate fractal dimension of 6
occurs. For the Reynolds number Re = 1800 the
laminar phase disappears


Fig. 13. Time series, reconstructed attractor and distribution of pointwise dimension for (a) Re = 1400, noisy limit cycle, (b)
Re = 1658, intermittency and (c) Re = 1800, chaotic state.

and only the high dimensional part of the attractor
remains, showing the same fractal dimension
as before.

For a better understanding of this intermittency
we recorded the distribution of the lengths of
laminar phases, seen in fig. 14a. For this plot the
lengths of 500 laminar phases were recorded with
a resolution of 1 second. (The period of the limit
cycle was of the order of 1 second, which limits
the accuracy of the time estimate.) The statistics
are not sufficient to estimate an exact value for
the exponent of the expected distribution
P(T) ~ T^(-α). A rough estimate gives a value on
the order of α ~ 1 [30].

The mean length ⟨T⟩ of the laminar phases as
a function of Reynolds number is shown in fig.
14b. The solid line is a fit of the measurement
points to the function ⟨T⟩ = a(ε), where a =
7.236. This is the expected scaling law with
ε = Re - Re_c and Re_c = 1652. From the experimental
data we can only exclude type I and
type III intermittency. Further investigations will
be done to decide whether the intermittency is of
type II or a new kind.


Fig. 14. (a) Distribution P(T) of laminar phase durations T
for Re = 1658, recorded with a resolution of 1 s; (b) mean
duration of the laminar phases versus Reynolds number.


7.  Conclusion 

One  motivation  to  find  efficient  algorithms  for 
the  analysis  of  chaotic  time  series  is  to  character¬ 
ize  all  states  of  a  nonlinear  system  in  a  given 
control  parameter  space.  In  this  paper  we 
showed that in Taylor-Couette flow a rich variety
of scenarios exists even for a very limited
geometry. Starting from various stationary and
periodic states we investigated a period doubling
cascade, a 3-torus and an intermittent scenario.
The experimental results show that noise reduction
is necessary to obtain the required accuracy
in the estimation of dynamical variables.
The method of singular systems analysis (SSA) is applied to the phase portrait analysis of high precision time series of
measurements  of  temperature  and  total  heat  transport  obtained  in  a  high  Prandtl  number  (Pr  =  26)  fluid  contained  in  a 
rotating,  cylindrical  annulus  subject  to  a  horizontal  temperature  gradient.  Global  SSA  is  found  to  be  a  highly  effective 
means  of  separating  weak  high  frequency  oscillatory  components  from  a  signal  dominated  by  strong  oscillations  at  low 
frequency,  revealing  the  possible  presence  of  weak  inertia-gravity  oscillations  within  a  large-scale  baroclinic  wave  flow. 
SSA  embeddings  may  fail  in  practice,  however,  for  signals  modulated  in  amplitude  where  the  carrier  and  modulation 
frequencies  differ  widely,  unless  a  sufficiently  long  window  is  used.  A  modification  of  global  SSA  with  short  windows, 
consistent  with  Takens’  ‘Method  of  Delays’,  is  proposed  and  verified  using  data  from  a  chaotic  regime  of  the  rotating 
annulus,  which  may  improve  SSA  embeddings  for  such  cases.  Local  SSA  is  applied  to  the  estimation  of  topological 
dimension  in  quasi-periodic  and  chaotic  flows,  and  produces  results  consistent  with  more  conventional  dimension 
measures. 


1.  Introduction 

In  characterising  complex  behaviour  in  the 
context  of  low  dimensional  chaos,  the  theory  of 
dynamical  systems  has  emphasised  the  geometric 
and  topological  properties  of  a  system’s  be¬ 
haviour  as  viewed  in  its  phase-space.  This  ap¬ 
proach  was  given  further  impetus  with  the  pro¬ 
posal  [1]  that  essential  aspects  of  the  time  be¬ 
haviour  of  a  dynamical  system  could  actually  be 
reconstructed from a suitable set of measurements
in a K-dimensional state space (K integer)
constructed  using  time  delays  (hereafter  the 
‘method  of  delays’  or  MOD).  This  approach  was 
originally  proposed  as  a  generic  analysis  tool 
applicable  to  virtually  any  dataset.  In  practice, 
however, results obtained are found to be sensitive
to factors such as the delay timescale τ and
assumed embedding dimension K, and to measurement
noise. Consideration of these factors
led  Broomhead  and  King  ([2],  hereafter  BK) 
and  Fraedrich  [3]  to  propose  the  application  of 
singular  system  analysis  (SSA),  a  technique  de¬ 
veloped  from  the  information  theory  of  Pike  et 
al.  [4]  and  related  to  the  Karhunen-Loeve  trans¬ 
formations  used  in  statistics,  to  the  problems  of 
dynamical  systems  analysis.  BK  showed  how  the 
“deterministic”  components  of  a  time  series 
could  be  separated  from  the  noisy  or  ‘stochastic’ 
components,  and  reconstructed  in  a  statistically 
optimal  way  using  a  subset  of  the  most  significant 
singular vectors. In subsequent papers, Broomhead
et al. [5, 6] further extended SSA to analyse
the local geometric properties of a reconstructed
attractor. Such methods were promoted
as  offering  significant  advantages  in  dealing  with 
real  (i.e.  noise-contaminated)  data  over  earlier 
phase-portrait  techniques  based  on  MOD. 

In  the  present  paper,  we  apply  both  global  and 
local SSA methods to the analysis of high precision
measured time series of temperature and
total  heat  transport  in  a  rotating  annulus.  Ther¬ 
mal  convection  in  a  rotating,  cylindrical  fluid 
annulus,  differentially  heated  in  the  horizontal, 
has  been  studied  for  many  years  [7]  as  a  labora¬ 
tory  analogue  of  the  large  scale  circulation  of  a 
planetary atmosphere. Such a system may serve
both as a valuable source of physical insight and
as a useful ‘test bed’ for the
development of analysis techniques which might
eventually be applied to meteorological data and
models. The annulus in its typical form possesses
circular azimuthal symmetry about the rotation
axis in its boundary conditions, and exhibits a
rich variety of flow regimes depending upon the
external conditions (principally temperature contrast
ΔT and rotation rate Ω). These regimes
range from steady, laminar axisymmetric convection
(analogous to tropical Hadley flow, at low
Ω), through regular steady or periodic baroclinic
waves (at moderate Ω) to highly irregular
aperiodic ‘geostrophic turbulence’ (at high Ω).
As  in  the  mid-latitude  atmosphere  of  the  Earth, 
waves  develop  as  a  result  of  a  potential  energy¬ 
releasing  (baroclinic  [7])  instability  of  the 
azimuthally-symmetric  component  of  the  flow 
associated  with  buoyancy  contrasts  driven  by 
differential  solar  heating  between  equator  and 
poles.  In  studying  how  regular  flow  regimes  in 
the  rotating  annulus  break  down  into  more  com¬ 
plex  and  disordered  flows,  therefore,  insight  may 
be  gained  into  a  range  of  nonlinear  dynamical 
processes  which  promote  varying  degrees  of  dis¬ 
order,  and  which  may  also  contribute  actively 
towards limiting the finite predictability of atmospheres
[8] and oceans (if not necessarily those of
the  Earth). 

In  the  present  work,  particular  emphasis  is 
placed  on  regions  of  parameter  space  close  to 
observed  transitions  to  ‘baroclinic  chaos’  ([9], 
hereafter  RBJS).  Section  2  briefly  describes  the 
experimental  system  and  analysis  methods,  and 
section  3  presents  some  results  from  the  applica¬ 
tion  of  global  SSA  to  the  data.  Some  of  the 


advantages  of  SSA  in  facilitating  the  extraction 
of  small  amplitude  components  of  the  signal  are 
demonstrated,  though  a  practical  limitation  of 
SSA  in  the  presence  of  a  noisy  time  series  with 
widely  separated  timescales  is  also  discussed.  An 
extension  to  the  conventional  SSA  is  proposed 
which  may  overcome  some  of  these  limitations. 
Section  4  presents  some  results  of  the  application 
of local SSA to quasi-periodic and chaotic
flows,  and  some  concluding  remarks  are  offered 
in  section  5. 


2.  Apparatus  and  phase  portrait  reconstruction 

The  working  fluid  was  contained  in  a  cylindri¬ 
cal  annulus  between  two  coaxial  thermally- 
conducting  cylinders  at  radii  r  =  2.0  cm  and  r  = 
8.5  cm,  and  between  rigid  insulating  boundaries 
in  contact  with  the  fluid  at  z  =  0  and  z  =  14.0  cm 
respectively.  The  apparatus  was  rotated  about  its 
vertical axis of symmetry at angular velocity Ω
and  differentially  heated  in  the  horizontal  at  the 
sidewalls  (the  outer  typically  being  the  warmer). 
The  annulus  was  designed  for  the  precision  mea¬ 
surement  of  fluid  and  boundary  temperatures 
and  of  total  heat  transport,  and  was  essentially 
the  same  as  described  by  Hignett  et  al.  [10]  and 
RBJS.  The  working  fluid  consisted  of  a  25% 
solution  by  volume  of  glycerol  in  water,  with  a 
Prandtl  number  of  26.4.  Temperatures  at  the 
boundaries  and  in  the  fluid  were  measured  using 
copper-constantan  thermocouples  (sensitivity 
~40 μV per K). In the fluid, 32 thermocouples
were  equally-spaced  in  azimuth  at  mid-height 
and  mid-radius,  enabling  the  azimuthal 
wavenumber  spectrum  to  be  obtained  readily  by 
fast Fourier transform techniques [10]. The total
heat  transport  through  the  inner  side  boundary 
was  measured  using  the  method  described  by 
Hignett  [10]  from  the  coolant  (water)  flow  rate 
and  the  difference  in  temperature  between  the 
inlet  and  outlet.  Total  heat  transport  on  time- 
scales  as  small  as  10  s  could  be  measured  to  an 
absolute precision of ±2%, though much
smaller relative changes (~parts in lO’) could be
detected. 

A  variety  of  different  procedures  were  used  to 
take  in  and  analyse  the  data.  Short  time  series  of 
measurements at all thermocouples in the fluid
and  boundaries,  and  of  the  total  heat  transport, 
were  recorded  and  used  to  identify  the  dominant 
flow  type  and  to  measure  wave  drift  rates  and 
(where  appropriate)  frequencies  of  time  varia¬ 
tions  in  amplitude  or  structure.  In  the  work 
presented  here,  at  selected  points  in  parameter 
space  much  longer  time  series  (a)  of  temperature 
at  one  of  the  ring  thermocouples  in  the  fluid,  and 
(b)  the  total  heat  transport,  averaged  over  1-2  s, 
were  recorded  simultaneously  for  up  to  250  drift 
periods  of  the  dominant  wavenumber  (requiring 
up  to  20  h  of  measurements).  This  procedure  was 
adopted  as  a  compromise  between  the  more 
desirable  course  of  recording  simultaneous  mea¬ 
surements  at  many  different  locations  in  the  flow 
and  the  practical  limitations  of  the  data  acquisi¬ 
tion  system  (mainly  on  sampling  rate  and  storage 
capacity)  available  at  the  time  the  experiments 
were  carried  out. 

Phase portraits were constructed by the usual
method of time-delay embedding or ‘method of
delays’ (MOD; [1, 11]), in which scalar series
e.g. of temperature T(t) are represented as a
trajectory in a K-dimensional embedding space
by denoting the state of the flow at time t by the
vector [T(t), T(t + τ), T(t + 2τ), ..., T(t + (K - 1)τ)].
This global embedding was further
refined by the use of SSA (see BK) to reproject
the trajectory onto a statistically optimum orthogonal
basis. The latter comprises the eigenvectors
of the n × n covariance matrix computed
using a sliding ‘window’ of n points (where n ≥
maximum required embedding dimension),
which is stepped along the time series. As discussed
in BK, this method avoids some of the
arbitrary choices necessary in the simple ‘method
of delays’ (especially of the delay time τ). For
the  present  work,  time  series  were  typically  sam¬ 
pled  at  1-2  s  intervals,  placing  200  or  more 
samples  per  wave  drift  period  (though  rather 


fewer per typical period of amplitude or structural
modulation, which ranged from 50-300 s). A
window length τ_w of between 75 s and 100 s
proved suitable in most cases, given the relatively
long wave drift periods. This value of τ_w is
somewhat longer than would be suggested from
BK (their eq. (3.20)), though is still somewhat
less  than  most  typical  timescales  observed  in  the 
flow  (see  RBJS).  For  some  cases  to  be  discussed 
below,  however,  even  this  length  of  window 
proved  to  be  too  short  to  resolve  significant 
structure  in  the  reconstructed  attractor. 
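
The reconstruction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the lag of one sample between window elements, the normalisation of the covariance matrix and the synthetic test signal (a 900 s wave sampled every 2 s with weak noise) are all chosen here for illustration only.

```python
import numpy as np


def trajectory_matrix(x, n):
    """Rows are successive n-point windows of x (delay vectors with a lag of one sample)."""
    x = np.asarray(x, dtype=float)
    if x.size < n:
        raise ValueError("series shorter than window")
    return np.lib.stride_tricks.sliding_window_view(x, n)


def global_ssa(x, n):
    """Return (variance fractions, eigenvectors, projected coordinates) for an n-point window."""
    x = np.asarray(x, dtype=float)
    X = trajectory_matrix(x - x.mean(), n)
    cov = X.T @ X / X.shape[0]            # n x n covariance matrix of the windows
    evals, evecs = np.linalg.eigh(cov)    # eigh returns eigenvalues in ascending order
    order = np.argsort(evals)[::-1]       # re-sort, largest variance first
    evals, evecs = evals[order], evecs[:, order]
    coords = X @ evecs                    # column k is the projection onto eigenvector k
    return evals / evals.sum(), evecs, coords


# Illustrative signal only (not experimental data).
rng = np.random.default_rng(0)
t = np.arange(0.0, 18000.0, 2.0)
x = np.sin(2 * np.pi * t / 900.0) + 0.01 * rng.standard_normal(t.size)
frac, evecs, coords = global_ssa(x, n=50)   # 50-point window, i.e. 100 s
print(frac[:4])
```

Truncating `coords` to its first few columns gives the filtered phase portrait; the remaining columns carry mostly noise for a signal of this kind.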


3.  Global  analyses 

In  addition  to  the  improved  objectivity  in  con¬ 
structing  embeddings  using  MOD,  a  number  of 
other  advantages  for  ‘phase  portrait’  reconstruc¬ 
tion  were  claimed  by  the  original  promoters  of 
SSA  (e.g.  see  BK).  In  particular,  it  was  noted 
that  use  of  a  truncated  set  of  the  derived  eigen¬ 
vectors  effectively  introduces  a  filter  into  the 
analysis,  eliminating  many  of  the  unwanted  non- 
deterministic  components  of  the  signal.  In  the 
present work, a 50-point window (K = 50, see
above) was used except where stated, and analyses
performed upon reconstructions involving
only the first few (≤6) eigenvectors. In the following,
we describe two examples of the use of
SSA  in  the  analysis  of  periodic  and  chaotic  time 
series  obtained  in  the  apparatus  outlined  above, 
which  illustrate  (a)  some  of  the  advantages  of 
SSA  in  extracting  small  signal  components  in  a 
remarkably  efficient  way,  but  also  (b)  a  limita¬ 
tion  of  the  simplest  form  of  SSA  in  constructing 
embeddings  for  flows  containing  components 
modulated  in  amplitude  with  widely  differing 
timescales  in  the  presence  of  noise. 

In  the  following  sub-sections  we  illustrate  as¬ 
pects  of  the  use  of  SSA  in  the  analysis  of  time 
series  with  reference  to  a  number  of  data  series 
obtained  at  various  parameter  settings  in  the 
above  mentioned  apparatus.  The  cases  discussed 
include  a  sub-set  of  those  presented  by  RBJS, 
and comprise four separate experiments carried
out at fixed temperature contrast ΔT ≈ 10 K and
variable rotation rate Ω. The salient details of
each  run  are  summarised  in  table  1,  including  the 
values  of  the  relevant  non-dimensional  control 
parameters  (e.g.  see  [7]),  viz.  the  stability  pa¬ 
rameter  or  ‘thermal  Rossby  number’: 


\[ \Theta = \frac{g \alpha \, \Delta T \, D}{[\Omega (b - a)]^2} \qquad (3.1a) \]

and Taylor number, Ta, taken to be

\[ \mathrm{Ta} = \frac{4 \Omega^2 (b - a)^5}{\nu^2 D} \qquad (3.1b) \]

where g is the acceleration due to gravity, α the
volumetric expansion coefficient, ΔT the imposed
thermal contrast, Ω the system rotation
rate, D, a and b the annulus depth, inner and
outer radius respectively, and ν is the kinematic
viscosity.
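
As a rough consistency check on these control parameters, the sketch below evaluates them in Python, assuming the conventional definitions Θ = gαΔT D/[Ω(b - a)]² and Ta = 4Ω²(b - a)⁵/(ν²D). The geometry is taken from section 2 and the run values from table 1; the values of α and ν are placeholders that are not given in the text, so only ratios between runs (in which they cancel) are compared. The Ta entry used for case D (16.6) is an assumed reading of the table.

```python
def thermal_rossby(g, alpha, dT, D, omega, a, b):
    """Theta = g*alpha*dT*D / [omega*(b - a)]**2 (conventional definition, assumed)."""
    return g * alpha * dT * D / (omega * (b - a)) ** 2


def taylor_number(omega, a, b, nu, D):
    """Ta = 4*omega**2*(b - a)**5 / (nu**2 * D) (conventional definition, assumed)."""
    return 4.0 * omega**2 * (b - a) ** 5 / (nu**2 * D)


# Geometry from section 2 (cm); g in cm s^-2.
A_IN, B_OUT, DEPTH, G = 2.0, 8.5, 14.0, 981.0
# Placeholder fluid properties, NOT from the source; they cancel in the ratios below.
ALPHA, NU = 3.0e-4, 2.0e-2

# (dT in K, Omega in rad/s, Theta, Ta mantissa) as read from table 1.
cases = {"A": (9.98, 2.030, 0.422, 5.86), "B": (10.06, 1.873, 0.500, 4.99),
         "C": (10.01, 1.716, 0.593, 4.19), "D": (9.95, 3.420, 0.148, 16.6)}

ref_dT, ref_om, ref_theta, ref_ta = cases["A"]
theta_ref = thermal_rossby(G, ALPHA, ref_dT, DEPTH, ref_om, A_IN, B_OUT)
ta_ref = taylor_number(ref_om, A_IN, B_OUT, NU, DEPTH)

ratios = {}
for name, (dT, om, theta_tab, ta_tab) in cases.items():
    theta_ratio = thermal_rossby(G, ALPHA, dT, DEPTH, om, A_IN, B_OUT) / theta_ref
    ta_ratio = taylor_number(om, A_IN, B_OUT, NU, DEPTH) / ta_ref
    # computed ratio versus tabulated ratio, both relative to case A
    ratios[name] = (theta_ratio, theta_tab / ref_theta, ta_ratio, ta_tab / ref_ta)
    print(name, [round(r, 3) for r in ratios[name]])
```

Because ΔT is nearly fixed, Θ varies essentially as Ω⁻² and Ta as Ω² across the four runs, which is the scaling this comparison tests.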


3.1.  Separating  'fast'  waves  from  'slow’ 


One  of  the  most  remarkable  flow  regimes  of 
the  rotating  thermal  annulus  is  the  so-called 
‘steady  wave’  regime,  typically  found  over  a 
range  of  parameters  intermediate  between  the 
steady axisymmetric state (at low Ω) and fully
developed ‘geostrophic turbulence’ (obtained at
high Ω [7]). In this regime, the flow pattern
comprises  a  single  dominant  azimuthal 
wavenumber  component  and  its  harmonics, 
which  drifts  steadily  at  constant  amplitude 
around  the  apparatus  at  a  rate  determined  by  the 
flow  structure  and  apparatus  geometry  [7,  10]. 


Table 1
Experimental parameters.

Case   Flow type    ΔT (K)   Ω (rad s⁻¹)   Θ       Ta (×10”)
A      m = 3         9.98    2.030         0.422    5.86
B      m = 3AV      10.06    1.873         0.500    4.99
C      m = 3MAV     10.01    1.716         0.593    4.19
D      m = 3SV       9.95    3.420         0.148   16.6


The  existence  of  such  a  pure  steady  wave  state 
has  been  disputed  by  Pfeffer  et  al.  [12,  13],  who 
report  a  direct  transition  between  states  with  the 
same  azimuthal  wavenumber  but  which  are 
either  modulated  periodically  in  amplitude  or 
structure  (known  respectively  as  ‘amplitude  vac¬ 
illation’  AV  and  ‘structural  vacillation’  SV;  see 
[7,  12]).  The  source  of  this  controversy  may  be 
due  in  part  to  a  matter  of  definition  (Hignett 
[14],  for  example,  defines  a  flow  as  AV  only  if 
amplitude  fluctuations  exceed  5%  of  the  mean 
amplitude).  RBJS  found  that  the  modulation 
index η of AV, defined as

\[ \eta = \frac{A_{\max} - A_{\min}}{A_{\max} + A_{\min}} \qquad (3.2) \]

(where A_max and A_min are respectively the maximum
and minimum wave amplitudes) decreased
uniformly towards zero with increasing Ω, with
no clearly-defined transition point. Even so,
flows were found over a range of parameters in
which amplitude fluctuations were very small (η
typically <10~^).
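
For a sampled record of wave amplitude, the modulation index can be evaluated directly from the extremes of the record. The following Python sketch assumes the form η = (A_max - A_min)/(A_max + A_min); the function name and the synthetic amplitude record are hypothetical and serve only as an illustration.

```python
import numpy as np


def modulation_index(amplitude):
    """(A_max - A_min) / (A_max + A_min) for a sampled wave-amplitude record."""
    a = np.asarray(amplitude, dtype=float)
    a_max, a_min = a.max(), a.min()
    if a_max + a_min <= 0.0:
        raise ValueError("amplitudes must be positive")
    return (a_max - a_min) / (a_max + a_min)


# Hypothetical amplitude record: a 2% vacillation about a mean amplitude of 1.
phase = np.linspace(0.0, 2.0 * np.pi, 401)
eta = modulation_index(1.0 + 0.02 * np.sin(phase))
print(f"eta = {eta:.4f}")
```

In practice the extremes of a noisy record overestimate η, so some smoothing of the amplitude record before taking the maximum and minimum would normally be needed.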

Time  series  from  a  steady  wave  flow  would  be 
expected to take a very simple form, in which
local  temperatures  oscillate  periodically  at  the 
drift  frequency  of  the  dominant  azimuthal 
wavenumber  and  the  total  heat  transport 
through  each  sidewall  is  constant.  Fig.  1  shows 
the  singular  spectrum  obtained  for  the  tempera¬ 
ture  measured  at  a  fixed  location  in  the  flow 
from  such  a  steady  wave  state  in  the  present 
apparatus  (case  A;  see  table  1),  in  which  each 
singular value σ_n is normalised to indicate the
proportion  of  the  variance  accounted  for  by  each 
eigenvector  (a  typical  phase  portrait  constructed 
from  the  first  2-3  eigenvectors  for  this  flow, 
taking  the  form  of  a  limit  cycle,  is  illustrated  in 
fig. 9a of RBJS). Thus, some 99.9% of the
variance  is  contained  in  the  first  three  eigenvec¬ 
tors,  though  some  16  components  appear  to  lie 
above  a  well-defined  ‘noise  floor’  around  al  = 
lOVo. 

The  apparent  pairing  of  eigenvalues  4  and  5 
Fig. 1. Singular spectrum obtained from a time series of
4 × 10^4 temperature measurements in a steady wave flow in
the rotating annulus experiments of [9]. Analysis employed a
window of 50 points sampled every 2 s (τ_w = 100 s), and
eigenvalues have been normalised to show the proportion of
the total variance represented. The 95% confidence interval
for each eigenvalue [3, 15] corresponds to Δσ/σ ~ 0.6%.

(and others) indicates the possibility of additional
oscillatory components independent of the dominant
mode (cf. [15]). This is further confirmed in
the sequence of eigenvectors c_n for n = 1-5 (see
fig. 2), in which the nodal complexity of each
eigenvector increases by one from n = 1-3 but
jumps sharply for n = 4 and 5 (figs. 2d and 2e).
Eigenvectors c_4 and c_5 both contain a smoothly
oscillating, nearly-sinusoidal component at high
frequency (period ~ 20 s) and form a complex
pair in relative phase quadrature.

On projection of the original signal onto c_1, a
smoothed version of the original time series is
recovered (cf. figs. 3a and 3b). Projection of the
time series onto c_4 or c_5 (fig. 3c), however,
reveals a sequence of short bursts of oscillation
of very small amplitude (peak amplitude ~ 0.1)
at a frequency ω ~ 0.3 rad s⁻¹, each lasting
around 200 s and repeated at the same point


Fig,  2.  (a)-(f)  Profiles  of  the  first  six  singular  vectors  corre¬ 
sponding  to  the  singular  spectrum  shown  in  fig.  1. 

within  every  drift  period  of  the  dominant  wave 
(τ_d ~ 900 s). The rapid oscillation is evidently
strongest  just  after  the  passage  of  the  tempera¬ 
ture  minimum  in  the  main  drifting  wave.  Such 
weak  oscillations  around  the  temperature  mini¬ 
mum  are  just  visible  in  the  raw  data  itself  (see 
fig.  3a).  We  may  construct  a  Poincare  section 
around  the  local  minimum  in  the  signal  projected 
onto c_1 (by taking the plane c_2 = 0), and plot
intersections in the sub-plane (c_4, c_5) to obtain
fig.  4.  This  clearly  shows  an  open  toroidal  struc¬ 
ture,  indicating  that  the  flow  is  actually  quasi- 
periodic  on  a  torus  of  very  narrow  cross-section. 
It  should  be  noted  that  such  a  clear  signature  of 
quasi-periodic  behaviour  could  not  readily  be 
obtained  without  the  use  of  SSA  (e.g.  using 
simple  MOD). 
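
The separation of a weak fast oscillation from a strong slow wave can be illustrated on synthetic data. The Python sketch below is an illustration under stated assumptions rather than a reproduction of the analysis: the signal is a pure 900 s sinusoid plus a continuous (not bursting) 20 s oscillation one hundred times weaker, with weak white noise, and the helper names `ssa` and `dominant_period` are introduced here. Because the synthetic slow wave has no harmonics it occupies only two eigenvectors, so the fast pair appears as components 3 and 4 rather than 4 and 5.

```python
import numpy as np


def ssa(x, n):
    """Global SSA: variance fractions, eigenvectors and projections for an n-point window."""
    x = np.asarray(x, dtype=float)
    X = np.lib.stride_tricks.sliding_window_view(x - x.mean(), n)
    evals, evecs = np.linalg.eigh(X.T @ X / X.shape[0])
    order = np.argsort(evals)[::-1]            # largest variance first
    evals, evecs = evals[order], evecs[:, order]
    return evals / evals.sum(), evecs, X @ evecs


def dominant_period(v, dt, nfft=4096):
    """Period of the strongest non-zero-frequency peak in a zero-padded spectrum of v."""
    spec = np.abs(np.fft.rfft(v - v.mean(), nfft))
    freqs = np.fft.rfftfreq(nfft, d=dt)
    k = int(np.argmax(spec[1:])) + 1           # skip the zero-frequency bin
    return 1.0 / freqs[k]


dt = 2.0                                        # sampling interval (s)
t = np.arange(0.0, 18000.0, dt)
rng = np.random.default_rng(1)
slow = np.sin(2 * np.pi * t / 900.0)            # strong drifting wave
fast = 0.01 * np.sin(2 * np.pi * t / 20.0)      # weak fast oscillation
x = slow + fast + 0.001 * rng.standard_normal(t.size)

frac, evecs, coords = ssa(x, n=50)              # 50-point window, 100 s
periods = [dominant_period(evecs[:, k], dt) for k in range(4)]
for k in range(4):
    print(k + 1, f"{frac[k]:.2e}", f"{periods[k]:.1f} s")
```

The two fast eigenvectors carry nearly equal variance, which is the eigenvalue pairing referred to in the text, and plotting the corresponding columns of `coords` against one another gives the analogue of the sub-plane used for the Poincaré section.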

These  bursting  oscillations  are  quite  weak  in 
amplitude,  and  are  undetectable  in  the  full  32- 
Fig. 3. (a) Extract from the temperature time series obtained
at  mid-depth  and  mid-radius  in  the  steady  wave  flow  regime 
of  the  rotating  annulus  (see  figs.  1  and  2);  (b)  projection  of 
the  time  series  onto  the  first  singular  vector  of  fig.  2(a);  (c) 
projection  onto  the  5th  singular  vector  (fig.  2e). 

point  thermocouple  data  (owing  to  a  reduced 
signal-to-noise  ratio).  We  therefore  have  very 
little  information  with  which  to  determine  the 
nature  of  this  phenomenon.  However,  the  clear 
synchronization  of  the  bursts  with  the  wave  drift 
would  suggest  that  the  oscillation  arises  from  an 
interaction  with  the  apparatus  itself,  perhaps 
even  with  the  thermocouple  probe  itself.  Fur¬ 
thermore,  the  observed  carrier  frequency  of  the 
bursts  is  quite  close  to  that  of  the  mean  buoyan¬ 
cy  frequency  N,  defined  as 

(3.3) 
Fig. 4. Poincare section obtained from the temperature time
series  shown  in  fig.  3a  projected  onto  the  singular  vectors  of 
fig.  2.  Section  shows  the  intersections  of  the  reconstructed 
trajectory  with  the  plane  c,  =  0,  plotted  onto  the  sub-plane 

(C4,  C5). 

estimated to be ~0.4 rad s⁻¹ from the fluid properties
and experimental parameters. It is conjectured,
therefore, that these weak oscillations
represent  a  form  of  inertia-gravity  wave  associ¬ 
ated  with  the  passage  of  a  certain  portion  of  the 
main  wave  past  a  stationary  topographic  obsta¬ 
cle,  such  as  the  thermocouple  ring,  though  more 
detailed  and  sensitive  spatially-resolved  measure¬ 
ments  and/or  a  numerical  simulation  would  be 
needed  to  confirm  this  interpretation. 

3.2.  Embedding  flows  with  slow  modulations 

In  the  previous  section,  a  clear  example  was 
presented  of  a  situation  in  which  global  SSA 
greatly  facilitated  the  extraction  of  a  weak  high 
frequency  periodic  signal  component  from  a 
much  stronger  oscillation  at  a  significantly  lower 
frequency.  With  the  benefit  of  hindsight,  it  must 
be acknowledged that more conventional bandpass
filters would probably be equally effective in
enabling the extraction of the high frequency
inertia-gravity wave signal in this case. It is
unlikely, however, that such methods would
have given such immediate prominence as SSA
to this component of the time series, which in
this  case  prompted  further  investigation  of  a 
previously  unrecognized  dynamical  phenom¬ 
enon. 

In  the  following,  in  contrast  to  the  above,  we 
discuss  a  case  in  which  SSA  was  unable  to  pro¬ 
duce  a  satisfactory  embedding,  and  present  an 
extension  which  enables  the  construction  of  an 
improved  embedding  for  signals  comprising  a 
fast  oscillation  which  is  slowly  modulated  in  am¬ 
plitude.  As  an  example  of  this  type  of  signal, 
RBJS  discussed  a  baroclinic  flow  regime  in  which 
two  or  more  incommensurate  azimuthal  wave- 
number  components  were  each  modulated  in 
amplitude  at  two  distinct  and  widely  separated 
timescales. This regime was identified as chaotic
(see  RBJS),  and  arose  via  a  transition  from  a 
simpler  quasi-periodic  flow  in  which  a  single 
azimuthal  wavenumber  was  periodically  mod¬ 
ulated  in  amplitude  (i.e.  ‘amplitude  vacillation’ 
or  AV;  see  [7,  12]  and  case  B  of  table  1).  A 
typical  time  series  is  presented  in  fig.  5,  which 
shows  the  variation  with  time  of  the  total  heat 
transport H(t). The heat transport signal is modulated
about its mean value H̄ = 45 W due to the
fast modulation of the amplitude of the dominant
wave (largely responsible for radial heat
transfer) at the ‘vacillation frequency’ ω_v (corresponding
to a period around 150 s). The modulation
index η of the ‘vacillation’ is itself modulated
at an even lower frequency ~ω_m (hence
this regime is referred to as a ‘modulated amplitude
vacillation’ or MAV; see RBJS and case
C of table 1), though the long-period secondary
modulation is apparently aperiodic in form with
a period ~1500 s.

Fig. 5. Extract from time series of total heat flow, obtained
from a chaotically-modulated baroclinic wave flow (MAV,
case C; see table 1 and RBJS) in a rotating annulus.


Phase  portraits  derived  from  H(t)  for  this  case 
are organised about a limit cycle at ω_v but are
spread by the slow modulation into a disk-like
structure of finite radial extent (see RBJS). A
typical Poincaré section from such a phase portrait
using a moderate length window (τ_w = 100 s)
is  illustrated  in  fig.  6a,  showing  the  points  clus¬ 
tered  along  a  linear  structure  with  a  thickness 
comparable  to  the  scatter  attributable  to  ex¬ 
perimental  noise  and  drifts.  However,  indica¬ 
tions  that  this  disk-like  appearance  was  masking 
the  true  dynamics  were  provided  by  return  maps 
of  the  projection  of  the  Poincare  section  onto  the 
Cj  axis  (see  RBJS).  These  maps  showed  a  closed 
elliptical form which suggested that the H(t) flow
was really organised about a torus, but that the
modulation was too slow to change its phase
significantly within either one period of the main
oscillation at ω_v or an interval of τ_w. Attempts
were made to vary the window length (up to
τ_w = 1000 s, obtained by resampling the time
series more sparsely at 10 s intervals), but all
failed to resolve the toroidal nature of the phase
portrait in any projection except when τ_w ~
1000 s (see fig. 6c), suggesting a fundamental
inability of this form of SSA to embed modulated
signals with widely separated carrier and
modulation frequencies (ω_v/ω_m > 10) unless
τ_w ~ τ_m (= 2π/ω_m). When τ_w ~ τ_m, fig. 6c clearly

reveals  the  underlying  toroidal  structure  with  an 
intriguingly  clumped  distribution  of  points 
around  the  torus  ‘walls’  with  some  indication  of 
finger-like  extensions  suggestive  of  a  ‘wrinkling’ 
of  the  toroidal  surface. 

RBJS  were  able  to  produce  an  improved  em¬ 
bedding  of  this  flow  using  short  windows  by 
constructing a modified state space which employed
two timescales. The coefficients c_1(t),
c_2(t), c_3(t), ..., constructed using a window
length of only τ_w = 100 s, were regarded as an
alternative  set  of  ‘observables’  in  the  sense  of 
Takens  [1],  and  a  new  delay  embedding  (termed 


Fig. 6. Poincaré sections obtained in SSA coordinates ((a)-(c)) from the heat transport time series shown in fig. 5, and ((d)-(f))
from the artificial time series given by eq. (3.4): (a) and (d) are sections in normal SSA coordinates with τ_w = 100 s, showing the
intersections of the trajectory with the plane c_1 = 0 plotted onto the sub-plane (c_2(t), c_3(t)); (b) and (e) are sections in hybrid
delay SSA coordinates showing intersections with the same plane as in (a) and (d) but plotted onto the sub-plane (c_2(t),
c_2(t + τ_D)), where τ_D = 350 s; (c) and (f) are sections in normal SSA coordinates with τ_w = 1000 s, showing intersections with the
plane c_1 = 0 on the sub-plane (c_2(t), c_3(t)).
hereafter ‘hybrid delay SSA’ or HDSSA) was
then constructed by taking the first three coordinates
at time t to be (c_1(t), c_2(t), c_2(t + τ_D), ...),
where τ_D was chosen to take account of the slow
modulation with a period τ_m. Further coordinates
could comprise projections onto higher
eigenvectors, with or without additional time
delays. In the present example, τ_D ~ τ_m/4, where
τ_m ~ 1500 s. Fig. 6b shows the same Poincaré
section as in fig. 6a but projected onto the new
basis, and now clearly shows a toroidal structure
(cf. the Poincaré section with τ_w = 1000 s in fig.
6c).  The  thickness  of  the  torus  ‘walls’  is  some¬ 
what  greater  than  the  experimental  noise  and 
drift,  with  some  suggestion  of  undulations  in  the 
exterior  surface  as  noted  above  for  fig.  6c.  The 
robustness  of  such  structures  would  tend  to 
favour  the  interpretation  of  the  irregular  nature 
of  the  long-period  modulation  as  low-dimension¬ 
al  chaotic  behaviour,  as  inferred  by  RBJS. 

The effectiveness of this construction in resolving the toroidal structure of
slowly-modulated quasi-periodic signals is further illustrated in figs. 6d and 6e, which
show the same analysis applied to a surrogate artificial time series A(t) generated from
the function

$$A(t) = A_0 + (A_1 + A_2 \sin \omega_1 t) \sin \omega_2 t , \qquad (3.4)$$

with $\omega_2 = \omega_d$ and $\omega_1 \sim \omega_m$ (cf. RBJS). The basic form of SSA
using a window of $\tau_w \sim 100$ s produces extremely elongated Poincaré sections in
which all points collapse onto a near-vertical line at $c_3 = 0$. In extreme cases, points
collapse completely onto a line, forming a section which is completely degenerate (see
fig. 6d). With larger windows, Poincaré sections projected onto higher order eigenvectors
were able to reveal the toroidal character of the signal reasonably clearly, though still
exhibited a tendency towards strongly elongated patterns which would still be difficult to
distinguish in the presence of noise (cf. fig. 6f). With the HDSSA embedding, however, the
Poincaré section (fig. 6e) opens out to reveal the toroidal structure quite clearly. In all
the above discussion, it is important to emphasise that the toroidal nature of H(t) could
not readily be established using the simple form of MOD with any reasonable value of
$\tau$, despite extensive attempts to investigate a wide range of projections, because of
the high degree of dynamical ‘noise’ and long correlation timescales of the slow modulation
in the signal which typically obscures the underlying structure.
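
The construction described in this section can be sketched numerically. The following is a minimal illustration, not the RBJS implementation: it generates a surrogate signal of the amplitude-modulated form used above, computes SSA coefficients with a short window, and forms the hybrid delay coordinates $(c_1(t), c_2(t), c_2(t + \tau_D))$. The sampling interval, amplitudes, carrier period and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def surrogate(t, a0=1.0, a1=1.0, a2=0.5, period_mod=1500.0, period_car=60.0):
    """Amplitude-modulated test signal A0 + (A1 + A2 sin w1 t) sin w2 t."""
    w1 = 2 * np.pi / period_mod   # slow modulation frequency (assumed value)
    w2 = 2 * np.pi / period_car   # fast carrier frequency (assumed value)
    return a0 + (a1 + a2 * np.sin(w1 * t)) * np.sin(w2 * t)

def ssa_coefficients(x, window):
    """Return SSA coefficients c_k(t) as columns, plus the singular values."""
    traj = np.lib.stride_tricks.sliding_window_view(x, window).astype(float)
    traj = traj - traj.mean(axis=0)            # remove the mean of each lag
    _, s, vt = np.linalg.svd(traj, full_matrices=False)
    return traj @ vt.T, s                      # projections onto singular vectors

def hdssa(c, delay):
    """Hybrid delay coordinates (c_1(t), c_2(t), c_2(t + delay))."""
    m = len(c) - delay
    return np.column_stack([c[:m, 0], c[:m, 1], c[delay:, 1]])

dt = 5.0                                   # assumed sampling interval (s)
t = np.arange(0.0, 15000.0, dt)
c, s = ssa_coefficients(surrogate(t), window=int(100 / dt))   # 100 s window
emb = hdssa(c, delay=int(375 / dt))        # delay of a quarter of the 1500 s period
```

Plotting intersections of `emb` with a plane such as $c_1 = 0$ would give the analogue of the hybrid-coordinate sections; that step is omitted here.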

4.  Local  analyses 

4.1.  Local  dimension 

Having achieved a satisfactory embedding of a trajectory from a time series, it is then
commonly required to estimate the values of invariants characterising the reconstructed
attractor. These may include various measures of attractor dimension, metric entropy and
the spectrum of Lyapunov exponents [11]. For the flows discussed above, RBJS reported
estimates of the pointwise dimension [16] and largest non-negative Lyapunov exponent
$\lambda_1$ (derived using a form of the algorithm due to Wolf et al. [17]), and Smith [18]
has presented evidence for low-dimension behaviour in several cases from the same data
using an extension of the Grassberger-Procaccia correlation dimension estimator. A
significant difficulty with this approach to dimension estimation, found in the analysis of
the flows considered here, however, is that the local curvature of the reconstructed
attractor varies strongly with location on the attractor. This leads to difficulties in
identifying a range of length scales over which integrals exhibit identifiable scaling
behaviour; a difficulty which can be partially overcome by the use of pointwise dimension
and other local estimators.
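
To make the idea of a scaling range concrete, the sketch below estimates a pointwise dimension as the slope of $\log N(r)$ against $\log r$, where $N(r)$ is the number of points within distance $r$ of a reference point. It is an idealised illustration on a flat two-dimensional sheet (an assumed test set, not annulus data), for which the scaling is clean; on a strongly curved attractor the usable range of $r$ would be much narrower. The function name and radii are assumptions.

```python
import numpy as np

def pointwise_dimension(points, ref, radii):
    """Slope of log N(r) against log r, N(r) = number of points within r of ref."""
    dist = np.linalg.norm(points - ref, axis=1)
    counts = np.array([(dist < r).sum() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope, counts

rng = np.random.default_rng(0)
sheet = np.zeros((20000, 3))
sheet[:, :2] = rng.uniform(-1.0, 1.0, size=(20000, 2))   # flat 2-D sheet in 3-D
radii = np.geomspace(0.1, 0.5, 8)                         # assumed scaling range
dim, counts = pointwise_dimension(sheet, np.zeros(3), radii)
```

For this test set the fitted slope `dim` should lie close to 2.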

4.2.  Local  SSA  and  topological  dimension 

Although it may be argued [3] that the singular spectrum from global SSA provides an
estimate of the dimension of an attractor from the number of significant eigenvalues, this
estimate is not robust since, in general, it depends upon the window length (BK [5, 15]).
Broomhead, Jones and King [5] proposed an alternative method employing SSA to examine the
geometry of localised regions of an attractor where the manifold is locally linear, and
thereby to obtain an estimate of the topological dimension k. Their procedure may be
summarised as follows.

A reference point is selected on the reconstructed attractor with position vector
$\mathbf{x}$ in the space spanned by the first d eigenvectors in a global SSA. Nearby points
within a ball of radius $\epsilon$ are identified and the $\epsilon$-neighbourhood matrix
$B_\epsilon(\mathbf{x})$ constructed with rows comprising the vectors
$\{(\mathbf{x}_i - \mathbf{x})^T : |\mathbf{x}_i - \mathbf{x}| < \epsilon\}$. The singular
vectors of $B_\epsilon$ are obtained and correspond to an optimal orthogonal coordinate
system centered on $\mathbf{x}$. As discussed by [5], the geometry of the manifold is
characterised from a consideration of the variation of the magnitudes of the singular
values $\sigma_i$ (equal to the square root of the eigenvalues) of $B_\epsilon$ as the ball
radius $\epsilon$ is varied. For small enough $\epsilon$, a neighbourhood of an m-manifold
in $\mathbb{R}^d$ will have k approximately equal singular values above a ‘noise floor’, and
these will grow linearly with $\epsilon$ until saturation or the effects of curvature in the
manifold become apparent. The remaining $\{\sigma_i : i > k\}$ should remain approximately
constant until values of $\epsilon$ where they are significantly affected by curvature.
Singular values affected by curvature then vary approximately as $\epsilon^2$ or faster
until they also saturate.
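
The procedure summarised above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the algorithm of [5] in full: the test set is a flat two-dimensional sheet in a four-dimensional space with a small added isotropic noise, and the division by the square root of the number of neighbours is an assumed normalisation so that the singular values reflect the r.m.s. extent of the neighbourhood rather than its population. The function name is hypothetical.

```python
import numpy as np

def local_singular_values(points, ref, eps):
    """Singular values of the eps-neighbourhood matrix centred on ref."""
    diffs = points - ref
    nb = diffs[np.linalg.norm(diffs, axis=1) < eps]   # rows (x_i - x)^T inside the ball
    # Assumed normalisation: divide by sqrt(number of rows) so that the values
    # measure r.m.s. extent along each singular direction.
    return np.linalg.svd(nb, compute_uv=False) / np.sqrt(len(nb))

rng = np.random.default_rng(0)
pts = np.zeros((20000, 4))
pts[:, :2] = rng.uniform(-1.0, 1.0, size=(20000, 2))   # 2-D sheet embedded in 4-D
pts += 1e-3 * rng.standard_normal(pts.shape)            # small 'noise floor'
ref = np.zeros(4)
spectra = {eps: local_singular_values(pts, ref, eps) for eps in (0.1, 0.2, 0.4)}
```

In this idealised case the two leading singular values grow in proportion to the ball radius, while the remaining two stay near the noise level, so that k = 2 would be inferred.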

This behaviour is clearly seen in an example from the rotating annulus taken from RBJS,
for which the flow comprises a quasi-periodic ‘amplitude vacillation’ (case B of table 1).
The resulting temperature series T(t) at a fixed location in the flow contains two
dominant frequencies $\omega_d$ and $\omega_m$ respectively (where $\omega_d$ is the drift
frequency of the dominant azimuthal wavenumber), and phase portraits clearly show the
trajectory to lie on a well-defined two-torus. A typical Poincaré section from such a flow
is shown in fig. 7a. Fig. 7b shows the result of a local SSA analysis, centered on the
point indicated by the cross in


Fig. 7. Poincaré section (a) and local SSA analysis (b) from a temperature time series obtained in a periodically-modulated
regular baroclinic wave flow (AV - Case B; see table 1 and RBJS): (a) section in SSA coordinates showing intersections with
$c_1 = 0$ plotted onto the plane $(c_2, c_3)$; (b) local analysis centred on the point indicated by the cross in (a) over the neighbourhood
enclosed within the dashed circle. Note that the 95% confidence interval on the relative magnitude of eigenvalues [5, 15] varies
with $\epsilon$ from ±3% at the largest values of $\epsilon$ to ±50% at small $\epsilon$.
fig. 7a and carried out for d = 7 over a neighbourhood extending out to the radius
indicated by the dashed circle. For $\epsilon \lesssim 0.6$, a pair of singular values
dominate and both clearly scale as $\epsilon$ over almost the entire range considered. The
remaining $\sigma_i$ are approximately constant for $\epsilon \lesssim 0.3$, after which
$\sigma_3$ grows rapidly, reaching an amplitude comparable to $\sigma_1$ and $\sigma_2$
around $\epsilon = 1$. Saturation effects are apparent in $\sigma_1$-$\sigma_3$ beyond
$\epsilon = 1$, around which the ball size becomes comparable with the walls of the torus
(cf. fig. 7b).

4.3.  Local  analysis  and  HDSSA 

For the MAV flow discussed above in section 3.2 (case C of table 1), it was noted that
global SSA can fail to produce a satisfactory embedding for amplitude-modulated flows with
widely differing carrier and modulation timescales unless $\tau_w \sim \tau_m$. It might be
expected that this could influence adversely the estimation of invariants from the
reconstructed attractor. As discussed by RBJS, pointwise dimension estimates for H(t) from
the flow discussed in section 3.2 indicated a value for $D_p$ around 2 for the scales
resolvable, while Lyapunov exponent estimates suggested a significantly positive value for
$\lambda_1$ indicative of low-dimensional chaos. The results of section 3.2 above, however,
suggest that much of the two-dimensional character of the H(t) attractor may be due to the
failure of the embedding, which causes the (basically toroidal) attractor to collapse onto
a sheet-like structure.

This apparent two-dimensional character of the attractor reconstructed using simple SSA
with short windows is also reflected under local SSA analysis, except at small $\epsilon$.
Fig. 8a shows a typical local SSA analysis for d = 7, centered at the point indicated by
the cross in fig. 6b and surrounded by the dashed circle at $\epsilon = 20$. Two singular
values scaling as $\epsilon$ clearly emerge for virtually all values of $\epsilon$.
$\sigma_3$ and $\sigma_4$ appear to scale as though affected by local curvature around
$\epsilon \sim 2$-3 until they saturate for $\epsilon > 5$. When the HDSSA method is used to
construct the embedding, however, evidence for some higher dimensional activity is
revealed. Fig. 8b shows the local SSA analysis comparable to that of fig. 8a for the same
region of the attractor, embedded using d = 7 for global coordinates given by



Fig. 8. Local SSA analyses from the heat transport time series shown in figs. 5 and 6 (a and b) in normal global SSA coordinates
(a) and HDSSA coordinates (b). Analyses are centred on the points indicated by crosses over the neighbourhoods enclosed within
the dashed circles in figs. 6a and 6b respectively. Confidence intervals on eigenvalues vary over the same range as in fig. 7.
$$\mathbf{x}(t) = (c_1(t), c_2(t + \tau_D), c_2(t), \ldots, c_d(t)) . \qquad (4.1)$$

In this case, the analysis shows three dominant singular values which scale as $\epsilon$,
the others remaining roughly constant except for $\epsilon > 15$ where curvature effects
start to become significant. $\sigma_4$ appears to scale as $\epsilon$ around
$\epsilon \sim 3$, though the statistical error on these points is relatively large owing
to the relatively few points remaining inside the ball. This would indicate that the
topological dimension k = 3 for this flow, which is more likely to be consistent with the
low-dimensional chaotic behaviour inferred from $\lambda_1$ and the evidence for simple
spatial structure etc. (see RBJS). Curiously, the estimate for $\lambda_1$ was scarcely
affected by changing to the embedding defined by (4.1) in estimates derived using the Wolf
et al. [17] algorithm. Values were found typically to reduce by only ~10% from those
obtained using the simple global SSA embedding with $\tau_w \sim 100$ s.

4.4. A transition to chaos in four dimensions?

The other principal type of transition to disordered flow observed in the thermal annulus
entails, as an intermediate stage, bifurcations from the steady wave state (cf. section
3.1) to one in which rapid fluctuations take place primarily in the structure of the
dominant wave (so-called ‘structural vacillation’ or SV [7]; see case D of table 1). This
SV state is now widely recognized as the first stage towards the emergence of
fully-developed ‘geostrophic turbulence’, in which the flow structure, dominant wavenumber
and amplitude are constantly changing in a disordered and aperiodic manner (e.g. see RBJS
[13]). Such a complex flow is likely at least to represent chaotic flow on an attractor of
relatively high dimension (assuming it is chaotic in a formal sense at all; e.g. see RBJS).
Guckenheimer and Buzyna [19] presented evidence from rotating annulus experiments which
indicated an increase in attractor dimension up to $D_p \sim 7$ as the SV regime approached
‘geostrophic turbulence’.

In the present work, RBJS presented estimates of $D_p$ spanning the SV regime which
suggested an increasing attractor dimension, though none of the estimates exceeded
$D_p = 4$. There were, however, some inconsistencies between dimension estimates obtained
from simultaneous T(t) and H(t) time series, and evidence for satisfactory scaling regions
was often marginal, suggesting that the estimates of $D_p$ in this regime were poorly
defined. Fig. 9 shows two examples of phase portraits obtained from T(t) spanning the
transition from steady wave flow to SV. Figs. 9a and 9c show the phase portraits from a
steady wave flow (case A; see table 1) and from a well developed SV (case D; see table 1),
respectively. As remarked by RBJS, the onset of SV appears to result in a uniform
thickening of the steady wave limit cycle via irregular bursts of high frequency
oscillations, but with little evidence of any systematic structure; the SV flow appears to
be immediately chaotic (via a narrow band of temporal intermittency in parameter space)
with no intermediate quasi-periodic state (see RBJS).

Figs. 9b and 9d show typical local SSA analyses for these reconstructed attractors,
centered upon the points in figs. 9a and 9c surrounded by the dashed circles, again for
d = 7. Fig. 9b shows
only  one  singular  value  rising  above  the  noise 
level  (around  a-  =  10"^)  and  scaling  as  e.  To 
within the limitations of experimental noise and drifts, therefore, k = 1 for this flow,
confirming it as a simple steady wave. In fig. 9d, however, there is clearly more structure
in evidence. The limit cycle about which the flow is organised, representing quasi-steady
wave drift, dominates the singular spectrum for $\epsilon > 0.5$. In the range
$0.2 \lesssim \epsilon < 0.5$, however, four singular values are found to be of comparable
magnitude and scale roughly as $\epsilon$, indicating a dimension k = 4 within the
resolution of the data. For $\epsilon < 0.2$, all singular values are comparable with the
noise and ill-defined because of poor statistics. This would
Fig. 9. SSA phase portraits ((a) and (c)) and local SSA analyses ((b) and (d)) from temperature time series obtained from a
steady baroclinic wave flow (Case A; see table 1) ((a), (b)) and a chaotic structurally-modulated baroclinic wave (SV - Case D; see
table 1 and RBJS) ((c), (d)). Local analyses are centred upon the points enclosed within the dashed circles shown in (a) and (c)
respectively. Confidence intervals on eigenvalues vary over the same range as in fig. 7.


seem to support the conclusion (see RBJS) that SV may be consistent with low-dimensional
chaos, with an attractor dimension in T(t) of $3 < D_p < 4$.

5.  Discussion 

Results presented above have clearly demonstrated the effectiveness of global SSA in
extracting very weak bursts of high frequency oscillations from a signal dominated by a much
stronger  component  (~10“  larger  in  amplitude) 
of  complex  structure  at  a  much  lower  frequency. 
It is noteworthy that, in contrast to more conventional filtering methods, no tuning of
the technique was necessary in order to achieve a very clear separation of the two signal
components. The present work has also shown, however, that embeddings obtained using SSA
can apparently fail when the signal comprises a very slow periodic or irregular modulation
of a high frequency carrier unless relatively large windows (based on the slow timescale
in the signal) are employed (note that this also applies to the case discussed in section
3.1), which may lead to excessive computational expense in some cases. Although
alternative methods are available from engineering practice to treat the demodulation of
amplitude-modulated carrier signals (e.g. [20]), the HDSSA approach proposed herein would
seem to offer a simple extension to the more conventional MOD which can readily enable
such signals to be embedded successfully for phase-plane analysis in cases where it may be
impractical to use very long windows.

Local SSA is shown to provide a viable alternative means of analysing the local structure
of a reconstructed attractor, if used with care and providing the underlying global
embedding is well-posed. In the present work, we have obtained estimates of topological
dimension which are consistent with alternative, more conventional, dimension estimates,
and provide supporting evidence for two distinct types of chaotic behaviour of low
dimension in a rotating, thermally stratified fluid (so-called ‘baroclinic chaos’). It is
important to remark, however, that, like the more conventional box-counting or correlation
dimension estimators, local SSA does rely on being able to identify scaling behaviour over
a reasonable range in radius of the chosen neighbourhood; a property which experience
suggests may not always be straightforward to find, and which places considerable demands
on data quality and quantity for attractor dimensions significantly greater than around 3.
The apparent success of local SSA in the present context, however, does provide some
encouragement to apply the approach to the evaluation of more complex quantities such as
the estimation of the spectrum of Lyapunov exponents [6].
A positive Lyapunov exponent can result from either chaotic dynamics or noise. As a possible means of assessing the
degree to which the exponent is noise-affected, and as a possible index in its own right, the localised divergence of two
neighbouring trajectories from a fiducial trajectory is compared, in the expectation of finding strong correlation when the
source of stretching is deterministic chaos, and little or no correlation when the source is coloured or uncoloured noise.
The method is applied to various equation systems and to aperiodic time series from the self-excited oscillation of a flexible
tube collapsed by external pressure and conveying a flow.


1.  Introduction 

This work arose out of investigations of the intriguing properties of a collapsed tube
conveying a flow. In concept, this is a model of many of the conduits in the human body.
The essentials are that the tube walls are flexible, and that pressure outside exceeds that
inside. This situation occurs in the circulation in the larger veins, and also in the
pulmonary airways, in the urethra and in the larynx. When a cuff is applied to the arm for
the measurement of blood pressure, these conditions occur in the brachial artery. A
comprehensive list of the physiological occurrences is given by Shapiro [1].

The self-excited oscillations that such a system will undergo are of particular interest.
These oscillations are the basis of lung wheezing, of the action of the vocal cords, and of
the vibration of a trumpet-player’s lips, and are related to events underlying the Korotkov
sounds heard by stethoscope or microphone downstream of the arm cuff. We seek to understand
the mechanism of the oscillations through analysis of experimentally observable behaviour,
since this system, although apparently simple, has proved resistant to quantitatively
accurate theoretical prediction [2].

In the course of analysis of the aperiodic oscillations, we find that the dynamics appear
to include an inherent noise component considerably in excess of that due to measurement.
This, the experimental viewpoint, is better expressed analytically by saying that the
dynamics appear very complicated or high-dimensional. We are thus led to consider ways to
quantify an aperiodic system that will differentiate between deterministic and noise
sources of orbital divergence.

2.  The  experiment 

The system under consideration is shown in fig. 1. It consists of a fairly short uniform
segment of silicone rubber tube, mounted between

Fig. 1. Schematic of the system investigated experimentally, showing the flexible tube mounted in the pressure chamber and
perfused from a constant-head reservoir. The fluid inertia and resistance to flow both upstream and downstream of the flexible
tube are calibrated parameters. The control variables are the pressures in the reservoir ($p_u$) and the chamber ($p_e$). The time
signals recorded are the pressures ($p_1$, $p_2$) and volume flow-rates into and out of the tube, and the cross-sectional area at the site
of maximal oscillation, near the downstream end.


rigid pipes which form the opposite ends of a pressure chamber. Compressed air in the
chamber causes the tube to collapse, or flatten out. In cross-section the tube is then
dumbbell-shaped, with opposite-wall contact in the middle. Water is propelled through the
tube from a constant-head reservoir.

Behaviour depends on at least ten dimensionless parameters [3]. There are the tube
parameters of length and wall stiffness, plus wall mass and longitudinal tension. The fluid
density and viscosity set the Reynolds number. In addition the rest of the apparatus
contributes fluid inertia and resistance to flow, both upstream and downstream of the
collapsible tube.

A two-dimensional slice of that parameter space has been characterised, by varying the
tube length and the flow resistance downstream. For each of twelve locations on that slice,
a control-space diagram in two further dimensions has been drawn up [4]. The axes of this
are the control pressures setting the flow-rate and the degree of collapse. Each diagram
includes various zones of different behaviour. These behaviours include steady flow, both
when the tube is open and when it is collapsed, various different types of oscillation,
defined by frequency range or waveform shape, and exponential instability or divergence.

A rich variety of qualitatively distinct modes of oscillation has been recorded. For those
oscillations which are markedly aperiodic, we seek to demonstrate underlying order. This is
equivalent to asking whether we have low-dimensional chaos, since high-dimensional
deterministic chaos and non-deterministic variation are equally intractable as far as
today's analysis tools are concerned. Assuming provisionally the presence of
low-dimensional chaos, we seek to characterise the degree of chaos, despite having access
only to experimental data with an inevitably limited signal-to-noise ratio. Most
investigations overcome this difficulty by substituting a low-dimensional simplified model
of the experiment. Such models exist for this system (e.g. ref. [3]) but have been shown to
represent the experimental system inadequately. Even the most recent theoretical models
(e.g. ref. [5]) fail in their quantitative predictions. A further complication of this
system is that even if there is a chaotic attractor, it is not necessarily intrinsic to the
self-excited oscillator. It might be that at some operating points the oscillator simply
becomes sensitively dependent on the time-varying details of the arriving turbulent flow.

3.  Methods  of  characterisation 

One well-known way to reduce the complexity of a system is to apply the techniques of
singular value decomposition (SVD) as developed for this purpose by Broomhead and King [6].
This method gives information including the maximum sensible embedding dimension and an
estimate of attractor dimension at localised points [7], but can be regarded for these
purposes as simply a particularly adept way of low-pass filtering.

A computationally efficient way then to characterise the degree of chaos is the positive
Lyapunov exponent, which by quantifying how fast neighbouring points get separated
indicates how rapidly the ability to predict future position is lost. Wolf et al. [8] have
promoted this index and provided algorithms for use with experimental data.


The problem with $\lambda_1$, the global Lyapunov exponent, is that its value depends on
the noise level; see fig. 2, where the log of the $\lambda_1$-value is plotted against the
log of the inverse of the signal-to-noise ratio. We find this noise-to-signal approach
logical, in that we add noise to the data as a way of investigating its effects. The
signal-to-noise ratio is estimated either from phase portraits or from the amplitude
spectrum.

Our phase-portrait method is explained with reference to fig. 3, a phase-portrait-style
plot of one state variable against another. This example is close to periodic, although
there is a secondary oscillation during one phase of the cycle which produces a two-torus.
A measure of the experimental signal-to-noise ratio, i.e., that which characterises the
transducers and data acquisition, is provided by comparing the strip width where all
trajectories converge maximally with the overall size of the limit cycle. The procedure
assumes that there is a part of the cycle when the dynamical system follows consistently
the same trajectory. To the extent that this assumption is false, the estimate of
signal-to-noise will be conservative.
Fig. 2. The logarithm (base 10) of the noise-to-signal amplitude ratio versus the logarithm of the largest Lyapunov exponent
($\lambda_1$), for aperiodic experimental data with added pseudo-random noise of varying amplitude.
Fig. 3. Typical phase-plane portrait of $p_1$ versus $p_2$ during low-frequency oscillation. The topology is basically that of a limit cycle,
which by virtue of a secondary oscillation of randomised phase around part of the cycle becomes a two-torus.


Alternatively, when there is marked trajectory variation at all phases of the cycle, the
noise level can be estimated from the Fourier spectrum. The peak-to-peak amplitude of
pseudo-random white noise giving the same level of noise floor to the spectrum as that
present at the high-frequency end of the experimental spectrum is found and compared with
the peak-to-peak signal amplitude. Since the spectral noise floor for the signal is not
completely flat and devoid of signal power, this method too gives a conservative estimate
of the signal-to-noise ratio.
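
A rough numerical version of this spectral estimate might look as follows. It is a sketch under stated assumptions, not the procedure actually used for the experimental data: instead of matching spectra by trial, it assumes the equivalent noise is uniformly distributed, so that a peak-to-peak amplitude a corresponds to a variance of a²/12, and it takes the upper half of the frequency range as the noise floor. The function name and the `floor_fraction` parameter are hypothetical.

```python
import numpy as np

def noise_to_signal(x, floor_fraction=0.5):
    """Estimate peak-to-peak noise / peak-to-peak signal from the spectral noise floor."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2 / len(x)
    start = int(len(power) * (1.0 - floor_fraction))
    floor = power[start:].mean()            # mean power at the high-frequency end
    # Assumption: uniform white noise of peak-to-peak amplitude a has variance
    # a**2 / 12, and with this scaling its expected power per bin is that variance.
    noise_ptp = np.sqrt(12.0 * floor)
    return noise_ptp / np.ptp(x)

rng = np.random.default_rng(0)
n = 4096
clean = np.sin(2 * np.pi * 8 * np.arange(n) / n)       # 8 whole cycles, no leakage
noisy = clean + rng.uniform(-0.05, 0.05, n)            # noise peak-to-peak = 0.1
ratio = noise_to_signal(noisy)
```

For a real signal whose high-frequency spectrum still contains signal power, this would overestimate the noise, in line with the conservative character of the estimate noted above.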

Having estimated the signal-to-noise ratio of the original data, we plot this $\lambda_1$
at the left-hand side in fig. 2. Then pseudo-random noise of varying amplitude is added to
the data and $\lambda_1$ is re-calculated each time. The value is seen to increase along
with the noise level. There is a suggestion of a leveling-out to a constant value on the
left, and indeed for attractors based on equations, for which a low noise-to-signal ratio
can be obtained, a constant value is reached, as shown in fig. 4. Here an example of a
periodic system (a noisy sine wave) is compared with the Lorenz attractor. The Lorenz
returns a significantly positive $\lambda_1$, the sine wave an insignificantly positive
one. (Significance is defined by computing the cumulative mean of the localised stretch and
its standard error once the cumulative average has ceased to converge.)

Now $\lambda_1$ is the cumulative average of all the local stretching estimates, each such
estimate being computed as

$$S = \frac{1}{N\,\delta t} \log_2 \frac{L(t)}{L(t - N\,\delta t)} ,$$

where  L  is  the  distance  between  the  fiducial  orbit 
Fig. 4. As fig. 2, but for (•) a sine wave and for (O) the Lorenz attractor ($\sigma$ = 10, $b$ = 5, $R$ = 28).


point  and  the  neighbour-orbit  point,  ht  is  the 
data  sampling  interval  and  N  is  an  integer.  What 
extra  information  is  available  from  looking  in 
detail  at  the  individual  stretch  estimates?  On  a 
noise-free  attractor,  regions  of  stretching  and 
folding  should  give  rise  to  somewhat  similar 
behaviour  on  many  different  orbits.  The  main 
idea  explored  in  this  paper  is  to  compute  not  one 
but  two  local  stretchings  (5,  and  S2),  by  tracking 
the  divergence  of  two  neighbouring  trajectories 
from  a  fiducial  trajectory.  (The  procedure  has 
similarities  with  the  use  of  three  trajectories  to 
compute  A  +  Aj.  But  in  this  case  we  are  con¬ 
cerned  with  linear  distance  of  each  neighbour 
from  the  fiducial  orbit  instead  of  the  area  of  the 
triangle  formed  by  the  current  point  on  each  of 
the  three  orbits.)  We  hypothesised  that  the  be¬ 
haviour  of  two  such  stretchings  should  be  highly 
correlated  when  the  stretching  is  the  result  of 
deterministic  processes,  and  uncorrelated  when 
the  computed  divergence  is  the  result  of  noise. 
We  postulated  that  such  a  comparison  of  5,  and 
S2  might  usefully  distinguish  between  chaos  and 
noise  where  A,  does  not,  and  in  particular  might 
help  to  quantify  chaos  in  the  presence  of  noise. 
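As an illustration of the local stretching estimate and of its cumulative average, a minimal Python sketch follows. It assumes a base-2 logarithm (in keeping with the bits/second units mentioned in the text); the function names `local_stretch` and `cumulative_mean` are hypothetical and are not taken from the paper or from Wolf et al.

```python
import math


def local_stretch(dist_start, dist_end, n_steps, dt):
    """Local stretching over n_steps samples, in bits per unit time.

    dist_start is L(t - N*dt), dist_end is L(t); base-2 log is an assumption.
    """
    if dist_start <= 0.0 or dist_end <= 0.0:
        raise ValueError("separations must be positive")
    return math.log2(dist_end / dist_start) / (n_steps * dt)


def cumulative_mean(values):
    """Running average of the local stretch estimates (the lambda_1 estimate)."""
    means, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means
```

A doubling of the separation over one time unit gives one bit per unit time; a halving gives minus one bit.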


4.  Implementation 

Except where embedding dimension is specifically varied, or where the equation system is four-dimensional, all calculations have been done in a three-dimensional embedding space, using two equal delays to generate coordinates from a one-dimensional array of x-values from the numerical integration of the equations. We concentrate on single-centre attractors (e.g. the Rössler, Shaw's velocity-forced Van der Pol system), as opposed to two-centre systems (Lorenz, double potential well), since the experimental data fall in this category. For these systems we can define an arbitrary centre around which orbits rotate as seen in two-dimensional projection. Many orbits can then be compared in terms of an angle φ which defines the region of phase space. The local stretchings are then distributed for the Rössler attractor as shown in fig. 5. Classically, Lyapunov exponents are expressed as bits/second of lost information, but conversion to bits/orbit allows comparison of different attractors. Points above zero represent divergence of neighbouring trajectories, and those below, convergence. In keeping track of the two neighbouring orbits, if the current distance to one (former) neighbour exceeds the maximum allowed, both neighbours are replaced simultaneously. As a result there are more replacements overall and the global mean of each of the two stretchings is slightly altered; S1 is higher than λ1 as computed using only one neighbour orbit, and S2, which is computed using always the second choice of nearby orbit, is lower. But clearly both see the same features in the attractor, whether they be divergence, convergence or cross-over, which is a combination of the two.

[Fig. 5 plot; horizontal axis: angle (degrees).]

Fig. 5. Position in terms of angular location (φ) on the reversed-time Rössler attractor (a = 0.15, b = 0.2, c = 10) versus the two local stretchings S1 and S2 (see text). The computation used 2048 data points calculated at δt = 0.3 s, with stretch calculated over each five δt.

The criteria to be satisfied by a replacement mean that the chosen orbit is not always the closest. The search for an S1 replacement is conducted on the principle that the nearest neighbour is selected, subject to the constraint that the angle θ1 (see fig. 6) between the vector to the last point used on the old orbit and the first used on the new orbit not exceed a preset maximum. This ensures that the computation always tracks a measure of the same Lyapunov exponent through the dataset. On occasion the local sparseness is such that no replacement satisfying this angular criterion can be found within the maximum allowed distance for a neighbour, and in this case the angular criterion is relaxed. These search principles are as established by Wolf et al. [8]. The S2 replacement is then found on similar principles, minimising the distance to the new neighbour while controlling the angle θ12 between the vectors from the current point on the fiducial orbit to the current point on each of the replacement orbits. Normally θ1 and θ12 are limited to 30° or less, but the three limit parameters are chosen so as to ensure that the percentage of instances when the angular criteria have to be relaxed is kept small.

Fig. 6. Diagram illustrating the angular constraints that control the process of selecting replacement neighbour orbits when the maximum separation is exceeded. Notation: p: current fiducial orbit point; n1, n2: current points on the first and second nearby orbits at the time when replacement is required; n1', n2': points found on two candidate replacement orbits by successive application of a box-search algorithm.

S1 and S2 can also be treated as two linear data arrays and cross-correlated. The zero-delay cross-correlation has the form

\[ cc_{12} = \frac{E[(S_1 - \bar{S}_1)(S_2 - \bar{S}_2)]}{\sigma_1 \sigma_2} , \]

where the numerator is the covariance of S1 and S2, and σ1 and σ2 are the standard deviations of S1 and S2 respectively. The single number thus obtained characterises the extent to which the data are noise-affected. This can be demonstrated as in fig. 7 by observing how the normalised cross-correlation between the two local stretching trains varies when noise is added. At low noise levels, cc12 is, like λ1, a characteristic measure of the attractor. For the Rössler attractor the asymptotic level of cross-correlation is 0.64, whereas for the Lorenz it is 0.69, and for a continuously differentiable periodic signal it is unity.
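The zero-delay normalised cross-correlation defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expectation is taken as a plain sample mean over the two stretch series, population (not sample) standard deviations are used, and the function name `zero_lag_cross_correlation` is hypothetical.

```python
import math


def zero_lag_cross_correlation(s1, s2):
    """Covariance of s1 and s2 divided by the product of their standard deviations."""
    if len(s1) != len(s2) or len(s1) < 2:
        raise ValueError("need two series of equal length (at least 2 samples)")
    n = len(s1)
    mean1 = sum(s1) / n
    mean2 = sum(s2) / n
    # Numerator: sample estimate of E[(S1 - mean1)(S2 - mean2)].
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(s1, s2)) / n
    sd1 = math.sqrt(sum((a - mean1) ** 2 for a in s1) / n)
    sd2 = math.sqrt(sum((b - mean2) ** 2 for b in s2) / n)
    if sd1 == 0.0 or sd2 == 0.0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (sd1 * sd2)
```

Because of the normalisation, rescaling either series by a positive constant leaves the index unchanged, which is the amplitude-insensitivity noted later in the discussion.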

That this measure of the behaviour of the attractor contains different information from λ1 can be seen by comparing the ordering of the same three attractors by λ1. Whereas the cross-correlation established the Lorenz as situated “between” the Rössler and the sine-wave, λ1 has the Rössler in the middle of the three.

In the noise-afflicted region, the various attractors examined all suffer reducing cross-correlation, along what looks to be a common or universal path. The aperiodic collapsible-tube data subscribe to this path, as shown in fig. 8, indicating that pre-processing by some means such as singular system analysis is required before the attractor can be characterised further.

To test whether an asymptotic cc12-value could be derived from noisy data using SVD, we added to the Rössler attractor that amplitude of noise which gave approximately the noise-to-signal ratio characteristic of the experiments, i.e. log10(n/s) slightly greater than -2. Then we applied singular value decomposition to the resulting x-variable time series and reconstructed the attractor in three dimensions using the first three singular vectors. Proceeding as described above, we computed the zero-lag cross-correlation of the S1 and S2 time series and compared it with the value found for the noise-free Rössler system (see the left-hand side of fig. 9a). The estimate found after using SVD to remove noise is close to that computed for the original noise-free system. Increasing amounts of noise were then added once more to the SVD reconstruction, and new cc12-values were computed, giving the curve delineated by open symbols. This curve was found to follow closely the cc12-values given by the original attractor with added noise (filled symbols).

Fig. 7. The log(noise-to-signal ratio) versus the zero-lag normalised cross-correlation cc12 of the S1 and S2 time series for (O) a sine wave, (O) the unforced van der Pol attractor (ω = 1, a = 10), (♦) Shaw's forced Van der Pol system (ω = 1.57, A = 0.25), (■) the Rössler attractor as in fig. 5, (▲) the Lorenz attractor as in fig. 4, and (▼) the double potential well system (ω = 0.83, f = 0.16, δ = 0.1), all with noise added as in fig. 2. Note that the two periodic systems yield markedly higher cross-correlation than the chaotic systems.

Fig. 8. As fig. 7, but with the addition of cc12 calculations for several different experimental time series from the collapsible-tube system.

Figs. 9b-9d show the results of applying this procedure to three other equation systems. Similar behaviour is apparent, encouraging us to believe that the procedure is viable for use on experimental data where no noise-free value is available. The outcome of treating examples of the experimental data similarly is shown in figs. 9e-9f. The value of cross-correlation found with the aid of SVD at low noise-to-signal ratio can be regarded as an estimate of the asymptotic value to which the system would tend in the absence of instrumental noise and high-dimensional dynamical influences.
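The SVD step used above (reconstructing the attractor from the first three singular vectors of a windowed time series) can be sketched as follows. This is a generic singular-system-analysis illustration using NumPy, not the authors' code: the window length, the column-mean subtraction and the names `trajectory_matrix` and `svd_reconstruct` are assumptions made here.

```python
import numpy as np


def trajectory_matrix(x, window):
    """Stack overlapping windows of the series x as rows."""
    x = np.asarray(x, dtype=float)
    n_rows = x.size - window + 1
    if n_rows < 1:
        raise ValueError("series shorter than window")
    return np.array([x[i:i + window] for i in range(n_rows)])


def svd_reconstruct(x, window, n_components=3):
    """Project the windowed series onto its leading right singular vectors."""
    traj = trajectory_matrix(x, window)
    traj = traj - traj.mean(axis=0)  # centring is an assumption of this sketch
    _, singular_values, vt = np.linalg.svd(traj, full_matrices=False)
    coords = traj @ vt[:n_components].T  # one reconstructed point per row
    return coords, singular_values
```

The rows of `coords` would then play the role of the three-dimensional phase-space points from which S1 and S2 are computed; components beyond the retained ones are discarded as the noise floor.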

It may reasonably be asked how sensitive the value of cc12 is to variations in the parameters which are specified to the computation of λ1 by the method of Wolf et al. [8] and which are also used in the derivation of cc12. These include the embedding dimension, the delay used to produce the phase portraits, and internal parameters specifying limits on the size of L(t), the distance between the fiducial and neighbour orbits, and the angular deviation allowed at the time of replacement. In addition the cc12-calculation introduces further parameters such as θ12. Fig. 10 shows for the Rössler attractor how cc12 varies with some of the parameters. Whereas we find that the behaviour of λ1 as a function of each parameter is somewhat unpredictable, and not always apparently systematic, cc12 consistently reaches a maximum in the region which by other criteria represents a physically or numerically reasonable choice for that parameter. Calculations on other equation systems and on collapsible-tube data show similar behaviour. This suggests that cc12 may have a further use, as a way of determining the optimal or most appropriate choice of the computational parameters involved in the calculation both of cc12 itself and of λ1. Further work is under way to determine how reliable this idea is.


5.  Discussion 

The notion of examining the localised rate of trajectory divergence for an attractor in phase space is of course not new; Nese [10] reviews the several ways in which this idea has been quantified, and adds his own analyses of the local predictability of the Lorenz attractor. The idea is sufficiently appealing that it appears frequently, a recent instance being the work of Caputo et al. [11]. In general the fact that the divergence rate varies systematically around an attractor is well established. However, we are not aware of previous work in which the extent of the coordination of the localised divergence has been examined as a tool to distinguish via its local dynamics the behaviour of a chaotic attractor from stochastic additive noise or extrinsic high-dimensional forcing.

As originally envisaged, the notion of comparing the localised stretch around the attractor was to have involved the use of a cloud of nearby points rather than simply two. This was abandoned for pragmatic reasons. Whether for reasons of data storage or of stationarity in the experimental conditions, the recorded time series cannot be extended arbitrarily. Consequently there is a finite limit on the density of sampling of the phase space. Use of more than the minimum number of neighbouring orbits then necessitates extending the distance that defines “neighbouring”; this should always be minimised. Secondly, the computation is most economic when only two neighbours are tracked.

Even when it is accepted that two neighbours will be used, there remains the question of how exactly to compare the localised stretching calculated between the fiducial orbit and each neighbour. The raw data are the stretch time series; we have opted to compare them by cross-correlation, using the correlation at zero lag as an index of the dynamical system. (In all cases we have investigated, the cross-correlation falls off rapidly at non-zero lag.) Alternatives include the cross-power spectrum, which would present information equivalent to the cross-correlation at all lags in terms of frequency. We have tried other possibilities. The normalised cross-correlation compares only the shape and phase of two waveforms; it does not reflect differences in amplitude. Without normalisation, the index becomes sensitive to the absolute rates of divergence that characterise the attractor; this property is already expressed in the Lyapunov exponent. We have also computed a modified function that retains the normalisation, yet decreases as the amplitude difference between the two input series increases. However, the results are not markedly affected by this modification.

[Fig. 9 panels (a)-(f), including (c) Van der Pol and (f) LG9; horizontal axes: log(n/s).]

Fig. 9. (a) log(n/s) versus cc12 for (•) the Rössler system, time-delay reconstructed in R^3 with δt = 0.7 s. At n/s = 0.02, corresponding to n/s for the experimental data without additional noise (see fig. 8), singular value decomposition (window length τw = 0.49 s) is used to effect a noise-free reconstruction, which is then the basis of further noise addition and cc12 calculations (○). (b) As fig. 9a, but for (▲, △) the forced Brusselator equations (A = 0.4, B = 1.2, α = 0.08, ω = 0.91) with δt = 0.1 s and τw = 0.7 s. (c) As fig. 9a, but for (♦, ◇) Shaw's van der Pol system as in fig. 7 (δt = 0.1 s, τw = 0.7 s). (d) As fig. 9a, but for (▼, ▽) Rössler's four-variable hyperchaotic system [9] and therefore reconstructed in R^4. δt = 0.1 s and τw = 2.5 s. (e), (f) As figs. 9a-9c, but for (■, □) aperiodic collapsible-tube time series, and therefore lacking results for low n/s without SVD. For LD5, δt = 0.0005 s and τw = 0.0065 s. For LG9, δt = 0.002 s and τw = 0.05 s.

The maximum value of the index cc12, at low noise-to-signal ratio and with the computation parameters optimised as in fig. 10, appears on the basis of the testing so far to be a robust characteristic of a given attractor. However, given that this result is supported by studies of only a limited number of numerical systems, apart from the experimental data from the collapsible tube, it should be regarded as provisional pending further work.

Since the collapsible-tube data were shown to inhabit the region of the diagram of noise ratio vs. stretching cross-correlation where the cross-correlation value depends on the noise level, a procedure based on the use of singular value decomposition was devised whereby a noise-free value could be estimated. This value was supported, first by the correspondence between the decline of the SVD-processed cross-correlation with added noise and that of the raw data, and secondly by the similarity between the correlation-decline curves of SVD-processed and raw data for other single-centre attractors.


Nevertheless the putative noise-free value of cross-correlation found for the collapsible-tube data through the use of SVD must be treated with caution. Essentially the procedure devised is an extrapolation. The source of complicated behaviour in the collapsible-tube data is thought to be largely the high-dimensional dynamics, mimicking noise of unknown colouration which arises in conjunction with the signal itself, whereas the noise in the equation systems is additive and white. Furthermore, the equation systems are all very low-dimensional. These differences mean that it cannot be inferred with complete certainty that the behaviour of the cross-correlation of the SVD-processed collapsible-tube data with added noise is indicative of how a putative noise-less collapsible-tube recording would behave. Indeed the very concept of a noise-free collapsible-tube experiment is ill-defined if the source of noise-like variation is the high-dimensional dynamics, i.e., is inherent to the physical process whereby the aperiodic time series are generated. Nevertheless, to the extent that these uncertainties permit, the procedure devised yields a potentially useful characterisation of the system.

In summary, an index has been devised which measures the extent to which the local divergence or stretch is organised on a phase-space regional basis. It can be used in at least two ways. One is as a measure of the extent to which the data are noise-corrupted, bearing in mind that for the aperiodic collapsible-tube data the greater part of the “noise” is seen as a reflection of high-dimensional attractor dynamics. In this role, the cross-correlation is used to factor interpretation of other measures such as λ1. The second use is as an independent criterion of the attractor organisation, giving information beyond that contained in the global Lyapunov exponent and in previously devised measures of local stretch. A possible third role is as an objective quantitative criterion for the optimal choice of computational parameters involved in the calculation of the largest Lyapunov exponent.
A dedicated electronic instrument for measuring fractal dimension of chaotic systems called "dimensiometer" is described. The pointwise correlation dimension can be obtained from experimental time series on a real-time scale. This technique employs a discrete sequence of the time intervals between the successive intersections of a single variable x(t) and an arbitrary level c. The “dimensiometer” is tested by means of a simple electronic chaos oscillator. The experimental results are compared with the numerical calculations carried out for the corresponding dynamical model.


1.  Introduction 

Many nonlinear physical systems exhibit chaotic temporal behaviour. In recent years the qualitative description of this phenomenon based on the visual inspection of the bifurcation diagrams, Poincaré sections, power spectra, etc., has evolved to the quantitative characterization employing dimensions, entropies and Lyapunov exponents [1]. In particular dimensions can be used to distinguish between random noise and deterministic temporal chaos. In addition, this quantitative measure makes it possible to follow the evolution of a system from one chaotic state to another.

Various digital methods have been proposed for determining fractal [2] and integer [3] dimensions of chaotic systems from time series data sets.

Several attempts have been made to develop analog techniques for determining dimensions without the need of digital computing equipment. An optical technique for measuring fractal dimensions of planar Poincaré maps has been described [4]. Also an electronic technique for measuring the phase space dimension from chaotic time series has been suggested [5].

The instrument described in our previous paper [5] is restricted to the minimum phase space dimension or the number of degrees of freedom, which is an integer. Here we report on a modified electronic tool (“dimensiometer”) for estimating the fractal pointwise correlation dimension.


2.  Procedure 

Usually the method of delays [6, 7] is used to reconstruct the attractor in n-dimensional phase space from a single observable x(t):

\[ X^n(t) = \{x(t),\, x(t+T),\, \ldots,\, x(t+(n-1)T)\} . \tag{2.1} \]

This continuous flow is digitized in discrete time steps dt (as a rule dt = T). So one obtains a set of n-dimensional vectors {X_i^n}, i = 1, 2, ..., N, where

\[ X_i^n = \{x(iT),\, x((i+1)T),\, \ldots,\, x((i+n-1)T)\} , \tag{2.2} \]

convenient for digital processing.
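The construction (2.2) can be illustrated with a short Python sketch. It assumes the series is already sampled at the step dt, so that the delay T corresponds to an integer number of samples `lag` (one sample when dt = T); the name `delay_vectors` is hypothetical.

```python
def delay_vectors(x, n, lag=1):
    """Return the n-dimensional delay vectors {x[i], x[i+lag], ..., x[i+(n-1)*lag]}."""
    last_start = len(x) - (n - 1) * lag
    if n < 1 or lag < 1 or last_start < 1:
        raise ValueError("series too short for this embedding")
    return [[x[i + k * lag] for k in range(n)] for i in range(last_start)]
```

For example, a five-point series embedded with n = 3 and a lag of one sample yields three overlapping vectors.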

When designing the analog electronic instrument we have modified the procedure in the following way. Let us set the condition

\[ x(t) = c , \tag{2.3} \]

where c is an arbitrarily chosen fixed level, e.g. the mean ⟨x⟩ is a suitable value for c. The given condition can be fulfilled at certain time moments, say t_1, t_2, ..., t_i, .... Denoting the differences t_{i+1} - t_i as T_i we obtain a sequence of the time intervals between the successive intersections of the variable x(t) and the constant level c, i.e. the “return” times

\[ T_1,\, T_2,\, \ldots,\, T_i,\, \ldots \tag{2.4} \]

From this sequence we construct m-dimensional vectors {T_i^m}, i = 1, 2, ..., where

\[ T_i^m = \{T_i,\, T_{i+1},\, \ldots,\, T_{i+m-1}\} . \tag{2.5} \]

Apparently, the dimension of an attractor obtained from {T_i^m} (the “T” approach) is one less than the dimension obtained from the actual set {X_i^n} (the “X” approach):

\[ d_T = d_X - 1 . \tag{2.6} \]

This is due to the employed condition (2.3), similarly to the case of the Poincaré sections (see also the appendix).
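The “T” approach of (2.3)-(2.5) can be mimicked digitally, as in the following Python sketch. Two points are assumptions made here rather than statements of the paper: only upward crossings of the level c are used, and each crossing time is located by linear interpolation between samples. The function names are hypothetical.

```python
def crossing_times(t, x, level):
    """Times at which x crosses `level` from below (linear interpolation)."""
    times = []
    for k in range(len(x) - 1):
        if x[k] < level <= x[k + 1]:
            frac = (level - x[k]) / (x[k + 1] - x[k])
            times.append(t[k] + frac * (t[k + 1] - t[k]))
    return times


def return_times(times):
    """Intervals T_i = t_{i+1} - t_i between successive crossings."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def interval_vectors(intervals, m):
    """m-dimensional vectors {T_i, T_{i+1}, ..., T_{i+m-1}} as in (2.5)."""
    return [intervals[i:i + m] for i in range(len(intervals) - m + 1)]
```

For a strictly periodic signal all return times are equal, so every vector coincides with every other; this is the zero-dimensional case that the instrument reports for periodic oscillations.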

3.  The  instrument 

A sketch of the “dimensiometer” is shown in fig. 1. The comparator (see also fig. 2) produces a train of short d-pulses corresponding to the time instants when the variable x(t) “crosses” the given level c. Thus the continuous chaotic signal x(t) is converted into a discrete sequence of pulses separated from each other by chaotically varying intervals ..., T_i, T_{i+1}, ... according to (2.4). These short d-pulses are extended in the expander. The width of the extended pulses (r-pulses) can be regulated electronically in the instrument. The width r plays the role of the size of the hyperspheres similar to the scaling parameter in the digital algorithm [2].

Fig. 1. Block diagram of the instrument for the correlation dimension measurements.

Fig. 2. Time diagram of the signals in the “dimensiometer”.

The multivibrator set also produces a sequence of d-pulses but with externally adjusted time intervals T_j, T_{j+1}, ... (it must be noted that only the “first” d-pulse in the multivibrator set is synchronized with the one in the comparator). These two sets of pulses, namely the r-pulses and the d-pulses (produced in the multivibrator set), are compared with each other in the coincidence scheme “&”. So, if the respective sets of the intervals {T_i^m} and {T_j^m} (fig. 2 illustrates the case of m = 2) coincide in time within an accuracy r, the scheme “&” generates a single coincidence pulse (the bottom trace in fig. 2). The latter pulses are counted in a pulse counter or a frequency meter.

The “dimensiometer” constructs m-dimensional vectors T_i^m (2.5) along with m-dimensional vectors (T_i^m + r^m), where

\[ r^m = \{r_1,\, r_2,\, \ldots,\, r_m\} \tag{3.1} \]

(for simplicity r_1 = ... = r_m = r). In addition it produces an artificial reference vector

\[ T_j^m = \{T_j,\, T_{j+1},\, \ldots,\, T_{j+m-1}\} , \quad j = \mathrm{const} . \tag{3.2} \]

The coincidence pulses do appear if

\[ T_{i+k-1} \le T_{j+k-1} \le T_{i+k-1} + r , \quad k = 1, 2, \ldots, m , \tag{3.3a} \]

or

\[ 0 \le T_{j+k-1} - T_{i+k-1} \le r . \tag{3.3b} \]

The total number of the points given by (2.5) appearing in the neighbourhood of the reference point (3.2) can be written as

\[ M_j^m(r) = \sum_i \prod_{k=1}^{m} \{ H(r - (T_{j+k-1} - T_{i+k-1})) \, H(T_{j+k-1} - T_{i+k-1}) \} , \tag{3.4} \]

where H(x) is the Heaviside function. This number behaves roughly as a power law

\[ M_j^m(r) \sim r^{d^*(m)} , \tag{3.5} \]

where d*(m) saturates at large embedding dimensions m to a certain value d* called the pointwise correlation dimension (the sum in (3.4) is taken over i, but not j).

The above procedure implies a local analysis of an attractor (j = const, r < T). Obviously, other parts of the attractor, if necessary, can be explored in the same way setting appropriate reference vectors T_j^m. Moreover, in contrast to digital methods employing time series data sets the “dimensiometer” can easily choose the vectors T_j^m in the neighbourhood of the most often visited parts of the attractor. This is achieved by adjusting T_j^m in order to select the maximum of M per time unit (such points have the maximum statistical weight in the averaging procedure).


4.  Chaotic  oscillator 

The “dimensiometer” was tested by means of the electronic chaos oscillator [8] employed in an earlier work [5]. This nonlinear oscillator (fig. 3) is similar to the Van der Pol oscillator but with an additional nonlinear chain consisting of the inductor L_1 and the semiconductor diode D. Fig. 4 illustrates the chaotic behaviour of the oscillator. The frequency range, in general, depends on the values of L, C and L_1 (in our experiments the central frequency of the broadband continuous spectrum was near 10 kHz).

Fig. 3. Circuit diagram of the chaos oscillator used to test the “dimensiometer”.

Fig. 4. Illustration of the output signal produced by the chaos oscillator: (top) snapshot of time series, (bottom left) phase portrait dx/dt versus x, (bottom right) Poincaré section dx/dt versus ∫x dt at x = 0.

A previous analysis [5] indicated this oscillator to have three degrees of freedom, i.e. the minimum phase space dimension is 3, sufficient to specify the chaotic state. So the dynamics of this chaos oscillator can be described by the set of three ordinary differential equations

\[ \frac{dx}{d\theta} = ax + y - z , \quad \frac{dy}{d\theta} = -x , \quad \frac{dz}{d\theta} = bx - bF(z) , \tag{4.1} \]

where F(z) = ln(kz + 1) if z ≥ 0 and F(z) = kz if z < 0 is a function representing the nonlinear current-voltage characteristic of the semiconductor diode. The dimensionless variables and parameters in (4.1) are given by

\[ x = U_C/U_0 , \quad y = \rho I/U_0 , \quad z = \rho I_1/U_0 , \quad \theta = t/\sqrt{LC} , \tag{4.2} \]

where U_C is the voltage across the capacitor C (the output signal), I and I_1 the currents through the inductors L and L_1 respectively,

\[ a = \rho/R , \quad b = L/L_1 , \quad k = R_0/\rho , \quad U_0 = k_B T/e , \quad \rho = \sqrt{L/C} , \quad R_0 = U_0/I_s , \tag{4.3} \]

where R is the modulus of the negative resistance contributed to the LC-contour by the amplifier and the positive feedback, k_B, T and e denote the Boltzmann constant, the temperature and the electron charge, and I_s is the saturation current of the diode.

Fig. 5. Numerical results for model (4.1) with a = 0.91, b = 10, k = 500: (top) snapshot of time series, (bottom left) phase portrait, (bottom right) Poincaré section.

The results of the numerical integration (fig. 5) exhibit a good agreement of the model (4.1) with the experimental results (fig. 4).
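For readers who wish to reproduce a trajectory of model (4.1), a minimal Python sketch is given below. The paper does not state which integrator was used; the fixed-step fourth-order Runge-Kutta scheme, the step size and all function names are assumptions of this sketch, and the default parameters a = 0.91, b = 10, k = 500 are our reading of the caption of fig. 5. A small step is used because the diode term makes the z-equation stiff near z = 0.

```python
import math


def diode(z, k):
    """F(z): logarithmic branch for z >= 0, linear branch for z < 0."""
    return math.log(k * z + 1.0) if z >= 0.0 else k * z


def rhs(state, a, b, k):
    """Right-hand side of model (4.1) in the dimensionless time theta."""
    x, y, z = state
    return (a * x + y - z, -x, b * x - b * diode(z, k))


def rk4_step(state, h, a, b, k):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(s + factor * d for s, d in zip(base, slope))

    k1 = rhs(state, a, b, k)
    k2 = rhs(shifted(state, k1, h / 2), a, b, k)
    k3 = rhs(shifted(state, k2, h / 2), a, b, k)
    k4 = rhs(shifted(state, k3, h), a, b, k)
    return tuple(s + h / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4))


def integrate(state, h, n_steps, a=0.91, b=10.0, k=500.0):
    """Return the list of states visited, starting from `state`."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(state, h, a, b, k)
        trajectory.append(state)
    return trajectory
```

The x-component of the returned trajectory is the quantity from which the return times of section 2 would be extracted.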


5. Correlation dimension

5.1. Experimental results

The results obtained by means of the “dimensiometer” are presented in fig. 6. The coincidence rate was high enough, e.g. ~3 kHz for m = 2, r = T_0, so the pulse counter operated in the frequency meter mode. Thus the values of M in fig. 6a were taken per 1 s. The scaling law of M(r) seems to be in satisfactory agreement with (3.5). In fig. 6b along with the main result (2) some other measurements, namely random noise analysis (1) and periodic oscillations treatment (3), are presented for comparison. The low value of the correlation dimension (d* = 0.82, i.e. d* < 1) obtained in the experiment conforms with the quasi-linear structure of the Poincaré section (fig. 4). This is the case of the so-called “small” strange attractor.

Fig. 6. (a) Log-log plot of the pointwise correlation integral M versus r as measured from chaotic oscillations for different embedding dimensions: (1) m = 1, (2) m = 2. (b) Correlation exponent d*(m) as a function of m for different types of signals: (1) random noise (no saturation), (2) deterministic chaos (the saturated value d* = 0.82), (3) periodic oscillations (the saturated value d* = 0). c = 0.

5.2. Numerical results

The pointwise correlation dimension for model (4.1) was calculated from the variable x(t) employing the T approach as in the case of the “dimensiometer”. An algorithm similar to (3.4) was used:

\[ M_j^m(r) = \sum_{i=1}^{N} \prod_{k=1}^{m} H(r - |T_{i+k-1} - T_{j+k-1}|) \tag{5.1} \]

(N = 14000). The coordinates of the three-dimensional reference vector T_j^3 were the following: T_j = 1.53, T_{j+1} = 0.94, T_{j+2} = 2.70.

The results are presented in figs. 7a, 7b.

Fig. 7. (a) Log-log plot of the pointwise correlation integral M versus r obtained from the single observable x for model (4.1) with different embedding dimensions: (1) m = 1, (2) 2, (3) 3. (b) Correlation exponent d*(m) as a function of m (the saturated value d* = 0.95). c = 0.
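The counting rule (5.1) and the estimation of the exponent in (3.5) can be sketched in Python as follows. The sketch assumes the convention H(0) = 1 and estimates the exponent by an ordinary least-squares fit of ln M against ln r; the paper does not say how the slope was obtained, and the names `pointwise_count` and `loglog_slope` are hypothetical.

```python
import math


def pointwise_count(intervals, reference, r):
    """Number of windows of `intervals` within r of `reference` in every coordinate."""
    m = len(reference)
    count = 0
    for i in range(len(intervals) - m + 1):
        # Product of Heaviside factors: all m coordinates must agree within r.
        if all(abs(intervals[i + k] - reference[k]) <= r for k in range(m)):
            count += 1
    return count


def loglog_slope(radii, counts):
    """Least-squares slope of ln(count) versus ln(r), skipping empty counts."""
    points = [(math.log(r), math.log(c)) for r, c in zip(radii, counts) if c > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two non-empty counts")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    return sxy / sxx
```

In practice the fit would be restricted to the range of r over which the log-log plot is close to a straight line, and repeated for increasing m until the slope saturates.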

6. Conclusion

An analog technique has been developed for the quantitative analysis of chaotic time series from nonlinear systems. This electronic instrument (“dimensiometer”) enables us to estimate the pointwise correlation dimension from a single observable. The real time operating mode is a valuable feature of its performance.

Ai^ndix 

The  purpose  of  this  appendix  is  to  illustrate 
relation  (2.6),  i.e.  to  compare  the  T  and  the  X 
approaches.  Numerical  calculations  have  been 
performed  for  model  (4.1)  with  the  control  pa¬ 
rameters  given  in  fig.  5.  The  global  correlation 
dimensions  have  been  computed  according  to 
ref.  [2], 

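$$C^n(r) = \frac{1}{N^2} \sum_{i \neq j}^{N} \prod_{k=1}^{n} H\bigl(r - |T_{i+k-1} - T_{j+k-1}|\bigr), \tag{A1}$$

$$C^n(r) = \frac{1}{N^2} \sum_{i \neq j}^{N} \prod_{k=1}^{n} H\bigl(r - |x_{i+k-1} - x_{j+k-1}|\bigr). \tag{A2}$$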

The results are shown in figs. 8a, 8b and figs. 9a,
9b, respectively. This is evidence that $d_T = d_X - 1$.

The small value of the dimension, d = 1.92, i.e.
1 < d < 2, may seem puzzling in the case of
a three-dimensional chaotic flow. However, it
must be emphasized that the value of the dimension
is obtained by averaging over the whole
attractor (double sum in




Fig. 8. (a) Log-log plot of the correlation integral C (global)
versus r for model (4.1) obtained in the T approach with
different embedding dimensions: (1) n = 1, (2) 2, (3) 3, (4) 4.
(b) Correlation exponent d(n) as a function of n (the saturated
value $d_T$ = 0.92). c = 0.




Fig. 9. The same as in fig. 8 but calculated in the usual X
approach (the saturated value of the correlation exponent in
(b) is $d_X$ = 1.92).


(A1) and (A2)). If an attractor is of sufficiently
simple topology, such as the Rössler attractor, it
is quite plausible that there is a great number
of points j where the local values d* < 2, except
for a small fraction of points j at the sites of
intersection between the flat sheets, where the
local values satisfy 2 < d* < 3. Consequently, the
average value may well satisfy 1 < d < 2, e.g.
d = 1.548 for the Rössler attractor [9].
The "persistence of strain" $\sigma^2$ has been ported from fluid mechanics into nonlinear dynamics, as a measure of the
geometry of attracting sets that reflects both stretching (sensitivity to initial conditions) and folding. We outline a method
for estimating $\sigma^2$ from experimental time series and apply it to electrical and mechanical signals recorded from the surface
of the heart during experimentally induced ventricular fibrillation.


1.  Introduction 

Chaotic dynamics is characterised by sensitivity
to initial conditions, whereby trajectories
starting at neighbouring points in phase space
diverge exponentially fast. This divergence may
be quantified by the Lyapunov exponent spectrum
of the dynamical system: for an n-dimensional
phase space, an infinitesimal n-sphere of initial
conditions, with its centre on the attractor, will
evolve into an n-ellipsoid. The average growth
rate of the norm of the ith principal axis $a_i(t)$
of this n-ellipsoid gives the ith Lyapunov exponent

$$\lambda_i = \lim_{t \to \infty} \frac{1}{t} \log_2 \frac{a_i(t)}{a_i(0)} \ \text{bits/s},$$

and both the maximal Lyapunov exponent and the
entire spectrum of Lyapunov exponents may be
estimated from an experimental time series
[1-3].


If one visualises phase space as a physical
space, and trajectories as flows, then the
Lyapunov exponents give a measure of how
volume in phase space expands or contracts during
the flow: a positive Lyapunov exponent is a
measure of how the dynamics stretches state
space in the direction of the eigenvector associated
with that Lyapunov exponent. For a trajectory
to be confined to a compact attractor, this
stretching in a given direction must be counteracted
by a folding back; chaotic motion is characterised
by stretching and folding. The "persistence
of strain", originally a measure of the
balance between shear-dominated and vorticity-dominated
flow in a fluid mechanical system, was
introduced into nonlinear dynamics by Dresselhaus
and Tabor [4] as a measure of both
stretching and folding.

The reconstruction of an attractor from experimental
time series provides a means of visualising
the geometry underlying the behaviour,
and this geometry may be quantified by measures
such as the fractal dimensions, Lyapunov
exponents and entropies. The persistence of
strain is a measure that can be defined for a
system of nonlinear ODEs, and that provides a
means of characterising the attractor.

An autonomous n-dimensional system of nonlinear
ODEs

$$\frac{dx}{dt} = F(x, t), \qquad x(0) = x_0, \tag{1}$$

can be linearized near a given point $x(t)$, $x = x(t) + \delta x(t)$,
to yield a linear but nonautonomous (depending on $x(t)$)
system of equations

$$\frac{d\,\delta x(t)}{dt} = A(x(t))\,\delta x(t), \tag{2}$$

where $A(x(t))$ is the Jacobian matrix

$$A_{ij}(x(t)) = \frac{\partial F_i}{\partial x_j}\bigg|_{x(t)} \quad \text{for } i, j = 1, 2, 3, \ldots, n. \tag{3}$$

The solution of (2) is

$$\delta x(t) = \int A(x(t))\,\delta x(t)\,dt. \tag{4}$$

In the tangent space, $\delta x(t)$ can be regarded as
the image of $\delta x(0)$ under the tangent map matrix
$T^t(x(0))$, that is,

$$\delta x(t) = T^t(x(0))\,\delta x(0). \tag{5}$$

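If we divide the time t into K subintervals of length $\Delta t$,
then according to the chain rule [5, 6]

$$T^t(x(0)) = T^{K\Delta t}(x(0)) = T^{\Delta t}(x(K-1)) \cdots T^{\Delta t}(x(1))\,T^{\Delta t}(x(0)). \tag{6}$$

When $\Delta t \to 0$, $T^{\Delta t}(x(t)) = A(x(t))\,\Delta t$. If the
eigenvalues of the matrix A are complex conjugate
pairs $(\alpha_i \pm \mathrm{i}\beta_i)$, the persistence of strain $\sigma^2$ is
defined as the second trace of A [4]:

$$\sigma^2 = \tfrac{1}{2}\,\mathrm{Tr}\,A^2 = \tfrac{1}{2} \sum_i \bigl(\alpha_i^2 - \beta_i^2\bigr). \tag{7}$$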


$\sigma^2 > 0$ implies stretching, and $\sigma^2 < 0$ implies
folding. Since these processes occur at different
parts of the attractor, averaging $\sigma^2$ over the
whole trajectory will allow the two processes to
cancel each other. Stretching and folding can be
separated by

$$\sigma^2(I) = \sigma^2_{\mathrm{Re}}(I) + \sigma^2_{\mathrm{Im}}(I), \tag{8}$$

where I is an invariant set such as an attractor.
The ratio between stretching and folding can be
used as a characteristic to quantify a strange
attractor [6, 7]:

$$\gamma = \frac{\sigma^2_{\mathrm{Re}}(I)}{\sigma^2_{\mathrm{Im}}(I)}. \tag{9}$$
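
As a worked illustration of the sign convention, consider a hypothetical two-dimensional Jacobian with one complex conjugate pair of eigenvalues. The matrix below is an assumption chosen for simplicity, and the prefactor 1/2 in the definition of the persistence of strain is taken as read from (7):

```latex
A = \begin{pmatrix} a & -b \\ b & a \end{pmatrix},
\qquad \lambda_{\pm} = a \pm \mathrm{i}\, b,
\qquad
A^2 = \begin{pmatrix} a^2 - b^2 & -2ab \\ 2ab & a^2 - b^2 \end{pmatrix},
\qquad
\sigma^2 = \tfrac{1}{2} \operatorname{Tr} A^2 = a^2 - b^2 .
```

For this example the sign of $\sigma^2$ is set by whether the real part (strain, $|a| > |b|$) or the imaginary part (rotation, $|b| > |a|$) of the eigenvalues dominates, which is the balance the text associates with stretching and folding.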

2.  Algorithm 

Recall that the persistence of strain
$\sigma^2$ is calculated from the eigenvalues of the
Jacobian matrix A. If the dynamical equations of
a physical system are known, the persistence of
strain is easily calculated from them. In practice,
however, we are often confronted with physical
systems whose dynamical equations are unknown.
All we can obtain from such systems is an
experimental time series of a single state
variable, sampled at regular, discrete
time intervals.

We start with a time series $x(t_0), x(t_1), \ldots, x(t_n)$,
with $x(t_i) \in \mathbb{R}$ and $t_i = t_0 + i\tau$,
$i = 0, 1, 2, \ldots, n$, where $\tau$ is the sampling interval.
We assume the time series is stationary and is
produced by a nonlinear dynamical system with a
low-dimensional attractor. A time series is manifestly
one-dimensional and represents a projection
$\pi: \mathbb{R}^N \to \mathbb{R}$ from the full state vector $\mathbf{x}(t) \in \mathbb{R}^N$.

The time delay method provides a
procedure [8] for reconstructing the state space by
embedding the time series into a higher-dimensional
space. A vector $\mathbf{x}(t) \in \mathbb{R}^m$ is created:


H.  Zhang  et  al.  /  Persistence  of  strain  from  heart  recordings 




$$\mathbf{x}(t) = \bigl(x(t),\, x(t-\tau),\, \ldots,\, x(t-(m-1)\tau)\bigr). \tag{10}$$

Here $\mathbf{x}(t)$ can be thought of as the image of the mapping
$\nu: \mathbb{R}^N \to \mathbb{R}^m$
from the full state space, $\mathbf{x}(t) \in \mathbb{R}^N$, to
the reconstructed state space, $\mathbf{x}(t) \in \mathbb{R}^m$. Takens
[9] and Mañé [10] have proved that when $m > 2D + 1$,
where D is the Hausdorff dimension of
the attractor, the embedding procedure provides
a topologically equivalent reconstruction of the
attractor from a one-dimensional time series.
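
The delay-vector construction in (10) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `delay_embed` and the parameter name `lag` (the delay measured in samples) are assumptions introduced here.

```python
def delay_embed(series, m, lag=1):
    """Build delay vectors (x[t], x[t-lag], ..., x[t-(m-1)*lag]).

    series : sequence of scalar samples taken at a fixed sampling interval
    m      : embedding dimension (assumed chosen by the user)
    lag    : delay in samples (hypothetical parameter name)
    """
    if m < 1 or lag < 1:
        raise ValueError("m and lag must be positive integers")
    first = (m - 1) * lag  # earliest index with a full history behind it
    return [
        tuple(series[t - k * lag] for k in range(m))
        for t in range(first, len(series))
    ]


if __name__ == "__main__":
    # Each tuple is one point of the reconstructed trajectory.
    for vec in delay_embed([0.0, 1.0, 2.0, 3.0, 4.0], m=3, lag=1):
        print(vec)
```

A series of length L yields L - (m - 1) * lag delay vectors, so short recordings and large embedding dimensions leave few points on the reconstructed attractor.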

Now let us examine the evolution of the vectors
on the reconstructed attractor. Mathematically,
the evolution of a vector on the attractor can be
regarded as a flow produced by the tangent map
$T^{\Delta t}(x_j)$, which maps the state $x_j$ to its image
$x_j'$ after a time $\Delta t$. The matrix of the linearised
flow $T^{\Delta t}(x_j)$ can be approximated from a single
trajectory by using the structure of the reconstructed
attractor. This is done by tracing the time evolution
of the difference vectors between $x_j$ and other points
$x_i$ of the same trajectory on the attractor that are
$\varepsilon$-distance neighbours of $x_j$. Centred at the
point $x_j$, the N $\varepsilon$-distance neighbours define a set
of difference vectors $x_i - x_j$, where $i = 1, 2, \ldots, N$.
After the evolution time $\Delta t$, $x_i$ maps into $x_i'$
and $x_j$ maps into $x_j'$, and the initial N difference
vectors in the ball $B(\varepsilon)$,

$$B(\varepsilon) = \{\, x_i - x_j \mid \|x_i - x_j\| < \varepsilon \,\}, \tag{11}$$

evolve into N difference vectors in the ellipsoid

$$B(\varepsilon') = \{\, T^{\Delta t}(x_i) - T^{\Delta t}(x_j) \,\}. \tag{12}$$

If we denote the N difference vectors in $B(\varepsilon)$
and $B(\varepsilon')$ by $\{y^i \mid i = 1, 2, \ldots, N\}$ and
$\{z^i \mid i = 1, 2, \ldots, N\}$ respectively, an $m \times m$ map
matrix $T_j$ is determined such that the average
squared error norm of all the maps $\{y^i \to z^i\}$
takes a minimum:

$$\min S = \min \frac{1}{N} \sum_{i=1}^{N} \|z^i - T_j\, y^i\|^2. \tag{13}$$

This equation can be solved for the matrix

$$T_j = C V^{-1}, \qquad (V)_{kl} = \frac{1}{N} \sum_{i=1}^{N} y^i_k\, y^i_l, \qquad (C)_{kl} = \frac{1}{N} \sum_{i=1}^{N} z^i_k\, y^i_l. \tag{14}$$

If the distance $\varepsilon$ and the time interval are small
enough, the matrix $T_j$ can be used as a good
approximation of the tangent map $T^{\Delta t}(x_j)$.
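
The least-squares fit of the map matrix can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it takes the standard normal-equation solution (the fitted matrix is C V^-1, with V the averaged outer product of the initial difference vectors and C the averaged outer product of evolved and initial vectors), and it is restricted to a two-dimensional reconstructed space so that the inverse can be written out explicitly. The function name `fit_tangent_map_2d` is hypothetical.

```python
def fit_tangent_map_2d(ys, zs):
    """Least-squares 2x2 matrix T minimising the mean of |z - T y|^2.

    ys : initial difference vectors (neighbour minus reference point)
    zs : the same difference vectors after one evolution step
    """
    n = len(ys)
    if n != len(zs) or n < 2:
        raise ValueError("need at least two matching pairs of vectors")
    # V[k][l] = mean of y_k * y_l,  C[k][l] = mean of z_k * y_l
    V = [[sum(y[k] * y[l] for y in ys) / n for l in range(2)] for k in range(2)]
    C = [[sum(z[k] * y[l] for y, z in zip(ys, zs)) / n for l in range(2)]
         for k in range(2)]
    det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    if abs(det) < 1e-12:
        # Neighbours that all lie along one direction cannot fix T.
        raise ValueError("difference vectors do not span the plane")
    V_inv = [[V[1][1] / det, -V[0][1] / det],
             [-V[1][0] / det, V[0][0] / det]]
    return [[sum(C[k][j] * V_inv[j][l] for j in range(2)) for l in range(2)]
            for k in range(2)]


if __name__ == "__main__":
    true_map = [[1.0, 0.5], [-0.2, 0.9]]  # hypothetical linear map
    ys = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
    zs = [(true_map[0][0] * a + true_map[0][1] * b,
           true_map[1][0] * a + true_map[1][1] * b) for a, b in ys]
    print(fit_tangent_map_2d(ys, zs))
```

The singularity check reflects a practical limitation of the method: if the neighbours are distributed very unevenly in direction, V becomes ill-conditioned and the fitted matrix is poorly determined.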

We calculate the persistence of strain by the
following procedure:

(i) Reconstruct the dynamical attractor in phase
space from the experimental time series
by the time delay method.

(ii) Divide the evolution of the trajectory into a
series of small time intervals, and obtain the
tangent map in each time interval by a
least-squares fit.

(iii) Calculate the persistence of strain from
the tangent map.

(iv) Compute the average of $\sigma^2$ obtained from
each time interval, and then calculate
$\sigma^2_{\mathrm{Re}}$, $\sigma^2_{\mathrm{Im}}$ and $\gamma$.

2.1.  Calibration 

The persistence of strain $\sigma^2$ estimated
from an experimental time series contains some
information about the system. For a chaotic
motion, $\sigma^2$ oscillates between a positive and
negative  magnitude.  However,  <7‘  provides  more 
valuable  information  than  The  series  esti¬ 
mated from a pseudo time series obtained by
numerical solution of a system's equations differs
from that calculated directly from the system's
equations. Figure 1 shows the $\sigma^2$ series for the
Lorenz equations with parameters R = 45.92,
$\sigma$ = 16, b = 4, calculated from the equations and from
a pseudo time series. The differences between
them are drastic.
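
The "calculated from equations" series can be sketched as below. This is an illustrative sketch only: it assumes the persistence of strain is half the trace of the squared Jacobian, uses the parameter values quoted in the text (R = 45.92, sigma = 16, b = 4), and integrates with a fixed-step fourth-order Runge-Kutta scheme. The step size, initial condition and function names are assumptions, not taken from the paper.

```python
SIGMA, R, B = 16.0, 45.92, 4.0  # Lorenz parameters quoted in the text


def lorenz(state):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (SIGMA * (y - x), x * (R - z) - y, x * y - B * z)


def jacobian(state):
    """Jacobian matrix of the Lorenz vector field at the given state."""
    x, y, z = state
    return [[-SIGMA, SIGMA, 0.0],
            [R - z, -1.0, -x],
            [y, x, -B]]


def half_trace_of_square(A):
    """0.5 * Tr(A @ A), using Tr(A^2) = sum over i, j of A[i][j] * A[j][i]."""
    n = len(A)
    return 0.5 * sum(A[i][j] * A[j][i] for i in range(n) for j in range(n))


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def sigma2_series(n_steps=2000, dt=0.005, start=(1.0, 1.0, 1.0)):
    """Persistence-of-strain values along a numerically integrated orbit."""
    state, values = start, []
    for _ in range(n_steps):
        state = rk4_step(lorenz, state, dt)
        values.append(half_trace_of_square(jacobian(state)))
    return values


if __name__ == "__main__":
    series = sigma2_series()
    print(min(series), max(series))
```

For this Jacobian the quantity reduces to 0.5 * (sigma^2 + 1 + b^2) + sigma * (R - z) - x^2, so under the stated assumption its sign along the orbit depends only on the x and z coordinates.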

Fig. 1. The series of persistence of strain for the Lorenz equations
with parameters R = 45.92, $\sigma$ = 4, b = 10. (a) The series
of persistence of strain calculated from the equations. (b) The
series of persistence of strain calculated from a pseudo time
series.

The reason that the series calculated from a
pseudo time series differs from the results calculated
from the equations is that, in the computation, the
N $\varepsilon$-distance neighbours are selected effectively at
random. The error arises because the N neighbours
are chosen nonuniformly in N different directions.
If we use singular value decomposition (SVD) [11]
to extract an orthogonal basis at the reference point
of the trajectory, project the ball composed of the N
difference vectors onto this basis, and trace the
evolution of the projected ball, the procedure would
be improved. Parameters such as the embedding
dimension and the sampling time interval
also influence the estimated result.

However, after averaging over the whole trajectory
(50000 points) of a pseudo time series, we get
$\gamma = 1.75$, $\sigma^2_{\mathrm{Re}} = 10.35$, $\sigma^2_{\mathrm{Im}} = 5.91$,
all of which are quite similar to the results
calculated from the equations, for which $\gamma = 1.68$,
$\sigma^2_{\mathrm{Re}} = 10.20$, $\sigma^2_{\mathrm{Im}} = 6.05$.


3. Case study: ventricular fibrillation


In  ventricular  fibrillation  the  ventricles  no 
longer  contract  synchronously,  but  different 
parts  of  the  ventricular  muscle  contract  at  differ¬ 
ent  times,  which  redistributes  blood  within  the 
ventricular  chamber.  The  effect  of  ventricular 
fibrillation  is  that  the  pump  function  of  the  heart 
fails  and  systemic  blood  flow,  including  that 
through  the  coronary  arteries  of  the  heart  itself, 
stops.  Consequently  the  metabolic  status  of  the 
ventricles  deteriorates  progressively  and  so  do 
the electromechanical signals. A continuous sample
of signals that remain stable for the duration
required for our analysis therefore cannot be obtained. One
way  of  overcoming  this  difficulty  is  to  induce 
fibrillation  while  coronary  artery  flow  is  main¬ 
tained.  Retrograde  perfusion  of  the  coronary 
arteries  in  an  isolated  heart  was  chosen  for  main¬ 
taining  a  stable  biological  microenvironment 
while  collecting  data  during  ventricular  fibril¬ 
lation. 

New Zealand White rabbits, weighing about
2.5 kg, were injected with a mixture of pentobarbitone
sodium (200 mg/ml Expiral) and heparin
(Monoparin 1000 units/ml) into a marginal ear
vein until no corneal reflex remained. The heart
was  exposed,  removed,  and  immediately  plunged 
into  a  mixture  of  ice-cold  Krebs  solution  contain¬ 
ing  heparin  and  massaged  to  remove  any  excess 
blood  from  the  coronary  circulation,  thereby  re¬ 
ducing  clot  formation.  The  heart  was  then  at¬ 
tached  to  the  Langendorff  apparatus  via  the 
aorta  for  retrograde  perfusion  of  the  coronary 
arteries.  The  Krebs  solution  was  first  passed 
through  a  filter  to  remove  invisible  particulate 
matter  and  heated  by  a  water  jacket  to  37°C. 

The Krebs solution was bubbled with a 95%
$\mathrm{O}_2$ / 5% $\mathrm{CO}_2$ mixture. The heart was paced, via
electrodes on the atrium or ventricle, faster than
its intrinsic rate by stimulating pulses of
about 1.5-2 times threshold.

Monophasic action potentials (MAPs) were
recorded using two suction electrodes [12] placed
on the ventricular epicardium, with care being
taken not to place them over major blood vessels.
A tripodal device operated by vacuum and
with strain gauges attached was sucked on to the
epicardium. This provided an analogue signal of
epicardial segment motion, which gives an index
of length changes in the cardiac muscle fibers
[12].

The heart was paced progressively faster until
it could no longer adequately follow the stimuli,
and then slowed back until it could. This rate,
about 150 ms between beats, was maintained.
After a period of 2-5 min, the electromechanical
signals went through episodes of
regular alternans (alternate large and small signals),
tachycardia (heart rate spontaneously very
fast but regular), and disorganised rhythm. This
ended in a period of maintained ventricular
fibrillation, where the ventricular muscle showed
writhing movements, with irregular electrical and
mechanical activity at any point on the surface of
the heart, and no synchronisation of activity
between distant points. During the latter part of
the protocol the pH of the coronary effluent
dropped from around 7.3 to 6.8.

We have used the method described in section
2 to analyze electrical and mechanical recordings
from the ventricular epicardial surface during
this period of maintained ventricular fibrillation.
The results of analysing the electrical signals are
presented in fig. 2.

Figure 2a shows 0.4 seconds of the electrical
activity: irregularity is apparent both in the size
of the spike-like components of the signals
(which correspond to ventricular action potentials)
and in their timing. The projection of the
reconstructed attractor into the plane formed by
the first two principal eigenvectors, obtained by
singular value decomposition, suggests chaotic
behaviour. The correlation dimension saturates
at 3.8 at an embedding dimension of 6. Estimates
of the six Lyapunov exponents converge to
their limiting values after the first second of the
data: the two largest are clearly positive. Thus,
on the criteria of a low, noninteger attractor
dimension and a positive Lyapunov exponent, the
signal is chaotic, and represents a sample of
motion on a strange attractor generated by
stretching and folding. The stretching and folding
of the trajectory on the attractor is illustrated
by the plot of $\sigma^2(t)$ in fig. 2d: if $\sigma^2 > 0$,
stretching of the trajectory in state space
predominates over folding, and if $\sigma^2 < 0$, folding
predominates over stretching.

Fig. 2. The characteristics of electrical ECG data recorded from a rabbit heart during ventricular fibrillation with sampling
frequency 1000 Hz. (a) The irregular time series for 2 seconds. (b) The attractor portrait projected onto the plane associated
with the first two principal eigenvectors after SVD. (c) The Lyapunov exponent spectrum, assumed embedding dimension 6. (d)
The series of persistence of strain, assumed embedding dimension 6.

The same analysis has been applied to the
mechanical signals recorded simultaneously during
ventricular fibrillation (fig. 3). Once again,
on the criteria of a noninteger attractor dimension
($D_2$ = 3.5) and a positive Lyapunov exponent,
the signal is clearly chaotic. The Lyapunov exponents
are given in table 1.

Although  there  are  quantitative  differences 
(due  to  the  different  bandwidth  and  signal  to 
noise  ratio  of  the  two  signals)  the  qualitative 
features  are  similar:  their  results  can  be  used  to 
identify  (and  quantify)  chaos  in  these  two  func¬ 
tionally  related  signals,  but  not  to  distinguish 


Fig. 3. The characteristics of mechanical ECG data recorded simultaneously from a rabbit heart during ventricular fibrillation
with sampling frequency 1000 Hz. (a) The irregular time series for 2 seconds. (b) The attractor portrait projected onto the plane
associated with the first two principal eigenvectors after SVD. (c) The Lyapunov exponent spectrum, assumed embedding
dimension 6. (d) The series of persistence of strain, assumed embedding dimension 6.
Table 1

The Lyapunov exponent spectrum for electrical and mechanical
signals recorded from a rabbit heart during ventricular
fibrillation. The assumed embedding dimension is 6. The
sampling frequency is 1000 Hz.

              λ1      λ2      λ3      λ4      λ5      λ6
electrical    3.1     1.0    -0.29   -1.7    -3.5    -8.3
mechanical    1.7     0.46   -0.92   -2.0    -3.2    -6.9

Table 2

The persistence of strain for electrical and mechanical signals
recorded from a rabbit heart during ventricular fibrillation.
The assumed embedding dimension is 6. The sampling frequency
is 1000 Hz.

electrical    9.8     1.4     6.5
mechanical    7.9     0.7     10.1

between them or between different models for their
functional interaction. The plot of $\sigma^2(t)$ for the
mechanical signal in fig. 3d is different from that
of the electrical signal: $\sigma^2(t) > 0$ for most of the
time series, so stretching of the trajectory in state
space predominates over folding.

These different roles of stretching and folding
of the trajectory in state space can be quantified
by the estimates of $\sigma^2_{\mathrm{Re}}$ and $\sigma^2_{\mathrm{Im}}$ provided in
table 2.

The  ratio  of  the  stretching  to  folding  is  clearly 
much  greater  for  the  mechanical  signal. 

4.  Conclusions 

There is an ongoing discussion as to whether
ventricular fibrillation (in clinical practice, or
induced in animal or isolated heart experiments)
is an example of chaotic or spatio-temporal
chaotic behaviour, or is irregular behaviour
generated by a complicated, highly structured
system; see papers in refs. [13-16].

A  positive  maximum  estimated  Lyapunov 
exponent  is  a  characteristic  of  a  chaotic  process, 
and  is  found  both  for  time  series  generated  by  a 
dynamical  system  (modeled  by  ODEs  or  a  map) 
and  for  activity  at  a  point  during  spatio-tempo¬ 
ral  chaos. 


Both the electrical and mechanical signals may
be identified as chaotic, on the basis of their
positive maximal estimated Lyapunov exponents.
This identification of chaos in both the
electrical and mechanical activity is consistent
with the view that electrical activity triggers the
mechanical activity, and so if the first is chaotic,
then so is the second. The normalised ratio $\gamma$
distinguishes between the two chaotic signals,
and so provides a means of distinguishing
between models of mechanical-electric
interaction in cardiac tissue. The types of
physiological model for the mechanisms underlying
ventricular fibrillation are (a) local chaotic
dynamics or the result of propagating, re-entrant
waves (an example of spatio-temporal chaos)
[17] and (b) electrical activity triggers mechanical
activity, or electrical activity triggers mechanical
activity, which in turn alters electrical activity
[18].

All combinations of these two types of model
can generate irregular fluctuations in mechanical
and electrical activity; the persistence of strain
measure provides additional constraints on the
behaviours of the models and so may be of value
in distinguishing between the different classes of
model.